BIOLOGICALLY ACTIVE FRAGMENTS OF
THERMUS FLAVUS DNA POLYMERASE

BACKGROUND OF THE INVENTION 
A. Field of the Invention 

The present invention relates to an isolated and purified DNA
that encodes a thermostable DNA polymerase. Additionally, the present
invention relates to a recombinant and thermostable DNA polymerase and to
fragments thereof, all having enhanced polymerase activity, and to methods
for producing the DNA polymerase and fragments. The present invention
further relates to recombinant fragments having decreased exonuclease
activity. The thermostable recombinant polymerases of the present invention
are useful because they are capable of providing enhanced polymerase activity
in bio-applications, such as in the polymerase chain reaction (PCR), in DNA
amplification and in thermal cycle labeling (TCL).

B. Background

The burgeoning field of biotechnology was revolutionized by
recombinant DNA technology, and DNA polymerase enzymes are an
indispensable tool in many modern in vitro molecular biological and
recombinant DNA applications, such as DNA sequencing; DNA cycle
sequencing; Polymerase Chain Reaction (PCR) and its many variations (see,
e.g., Erlich et al., Current Communications in Molecular Biology:
Polymerase Chain Reaction, Cold Spring Harbor Press, Cold Spring Harbor
(1989); Innis et al., PCR Protocols: A Guide to Methods and Applications,
Academic Press, San Diego (1990)); Thermal Cycle Labeling (TCL) (Mead and
Swaminathan, U.S. Patent App. Ser. No. 08/217,459, filed March 24, 1994;
PCT App. No. US94/03246, filed March 24, 1994); Random Primer Labeling
(RPL); Ligase Chain Reaction (LCR) (Wiedmann et al., PCR Methods and
Applications 3:S51-S64 (1994)); and other applications.

To date scientists have reported more than 40 different DNA
polymerases, and have reported DNA sequence information for some DNA
polymerase genes. Amino acid sequence information has been deduced from
the reported genes, and comparison of amino acid sequences has resulted in
the placement of reported polymerase genes into four major families: namely,
A, B, C, and X. Family A contains E. coli DNA polymerase I, an enzyme
involved in repair of DNA and in replication during fast growth. Family B
includes E. coli DNA polymerase II. Family C polymerases include E. coli
DNA polymerase III, the major replication enzyme. The fourth group,
Family X, contains enzymes such as the eukaryotic DNA polymerase β and
eukaryotic terminal transferases (Ito and Braithwaite, Nucleic Acids Res.
19:4045-4057 (1991)). The breakdown of DNA polymerases into families has
been helpful for the understanding of fundamental biological processes and for
the selection of enzymes for particular molecular biological applications.

DNA polymerase I (pol I) (Family A) enzymes have proved to
be very useful for DNA sequencing applications, PCR, TCL, and other
applications known in the art. Structure-function relationship studies indicate
that known DNA pol I molecules share a similar modular organization. A
5' → 3' exonuclease function is located in the N-terminal one-third of the
enzyme. The remainder of the molecule forms one domain which is further
classified into functional sub-domains. Adjacent to the 5' → 3' exonuclease
domain lies a 3' → 5' exonuclease sub-domain, followed by a polymerase
sub-domain (Blanco et al., Gene 100:27-38 (1991)).

In addition to classifying DNA polymerase enzymes into the
above families, it is also useful to classify such polymerases as mesophilic
(purified from mesophilic organisms) or thermophilic (purified from
thermophilic organisms) in origin. DNA polymerases of mesophilic
organisms were discovered earlier and have been more extensively studied
than their thermophilic counterparts. As early as the 1950's, isolation and
purification protocols for DNA polymerase I from mesophilic bacteria (e.g.,
E. coli) and some of their phages were developed and have since been
modified. See, e.g., Bessman et al., J. Biol. Chem. 233:171-177 (1958);
Buttin and Kornberg, J. Biol. Chem. 241:5419-5427 (1966). The DNA
polymerases studied most extensively are the DNA polymerase I enzymes
isolated from E. coli and the bacteriophage T7 DNA polymerase.

The DNA polymerases of mesophilic origin are useful in many
biological applications, such as in certain DNA sequencing applications.
However, many important applications (e.g., polymerase chain reaction (PCR)
applications and thermal cycle labeling (TCL)) require thermal cycling to
repeatedly denature template DNA and/or RNA and their extension products.
Because mesophilic DNA polymerases do not withstand the high temperatures
or thermal cycling of these applications, thermostable DNA polymerases enjoy
significant advantages over mesophilic DNA polymerases in such applications.

The discovery and study of such thermostable DNA
polymerases, from thermophilic bacteria, has been a much more recent
phenomenon. See, e.g., Uemori et al., J. Biochem. 113:401-410 (1993);
Uemori et al., Nucleic Acids Res. 21:259-265 (1993); Lawyer et al., J. Biol.
Chem. 264:6427-6437 (1989); Kaledin et al., Biokhimiya 45:644-651 (1980);
Chien et al., J. Bacteriol. 127:1550-1557 (1976); Gelfand et al., U.S. Patent
Nos. 4,889,818 and 5,079,352; Burke et al., U.S. Patent No. 5,108,892.

Perhaps the best-studied thermostable DNA polymerase, derived
from Thermus aquaticus, is called Taq pol I. A number of routes have been
taken in attempts to clone the Taq DNA pol I gene. (See, e.g., Lawyer et al.
(1989); Gelfand et al., U.S. Patent No. 5,079,352 (1992) (purification to
approx. 200,000 units/mg reported); Lawyer et al., PCR Methods and
Applications 2:275-287 (1993) (purification to 292,000 units/mg reported);
Engelke et al., Anal. Biochem. 191:396-400 (1990); Sagner et al., Gene
97:119-123 (1991).)

As explained above, in addition to possessing useful DNA
polymerase activity, a number of DNA polymerase I holoenzymes possess
exonuclease activities, which for many biological applications are undesirable.
Therefore, modified DNA polymerase enzymes having reduced exonuclease
activities are desirable. Through deletion of the 5' one-third of DNA
polymerase I genes, or the proteolytic cleavage and subsequent removal of the
portion of the holoenzyme encoded thereby, scientists have created DNA pol
I fragments retaining polymerizing activity, but having reduced 5' → 3'
exonuclease activity. (See, e.g., Joyce and Grindley, Proc. Natl. Acad. Sci.
80:1830-1834 (1983) (the Klenow fragment of the E. coli DNA polymerase
enzyme); Lawyer et al., J. Biol. Chem. 264:6427-6437 (1989), Gelfand et al.,
U.S. Patent No. 5,079,352 (1992), Lawyer et al., PCR Methods and
Applications 2:275-287 (1993) (the Stoffel fragment of the T. aquaticus (Taq)
DNA polymerase enzyme, reportedly purified to a specific activity of 369,000
units/mg); and Barnes, Gene 112:29-35 (1992) (the KlenTaq DNA
polymerase).)

In addition to Taq DNA polymerases, other thermophilic DNA
polymerases reportedly have been cloned and expressed in E. coli. Uemori
et al. reportedly expressed DNA polymerases from Bacillus caldotenax (J.
Biochem. 113:401-410 (1993)) and Pyrococcus furiosus (Nucleic Acids Res.
21:259-265 (1993)).

DNA polymerases from other bacteria of the genus Thermus
have been reported. A method of recovering a thermostable DNA polymerase
from cultured Thermus thermophilus is reported in U.S. Patent No. 5,242,818
to Oshima et al. (1993). The purported purification of native Thermus flavus
DNA polymerase with an apparent molecular weight of 66,000 daltons was
described by Kaledin et al., Biokhimiya 46:1576-1584 (1981). In one
application, Kainz et al., Anal. Biochem. 202:46-49 (1992), reported the
amplification of a 10.9 kb fragment and a 15.6 kb fragment from phage
lambda DNA with Hot Tub (T. flavus) polymerase (Amersham, Arlington
Heights, IL), but not with Taq polymerases. The rapid filter assay of Sagner
et al., Gene 97:119-123 (1991) has allowed Akhmetzjanov and Vakhitov to
identify a purported T. flavus (strain and origin unidentified) DNA
polymerase I gene and to determine the DNA sequence of this gene (Nucleic
Acids Res. 20:5839 (1992)). There is no report of the expression of an active
DNA polymerase encoded by the purported Thermus flavus DNA polymerase
I (Tfl DNA pol I) gene characterized by Akhmetzjanov and Vakhitov. Native
T. flavus (Tfl) DNA polymerase I is commercially available, e.g., from
Molecular Biology Resources, Inc. (Milwaukee, WI, Catalogue #1112-01).

The different reports of thermostable DNA polymerases and
their derivatives suggest these enzymes possess different, unpredictable
properties that may be advantageous or detrimental, depending on the
biological application in which the DNA polymerase is to be employed. For
example, Thermus thermophilus DNA polymerase I was reported to have a
significant reverse transcriptase activity. In the same reaction tube, in
successive steps, the reverse transcriptase function allows the production of
double-stranded DNA from RNA and then the DNA polymerase function is
used to amplify this cDNA. Myers and Gelfand, Biochemistry 30:7661-7666
(1991).

The KlenTaq DNA polymerase is an example of an enzyme
fragment with important properties differing from the Taq holoenzyme. The
KlenTaq DNA polymerase reportedly has a roughly two-fold lower
PCR-induced relative mutation rate than Taq polymerase holoenzyme. However,
more units of KlenTaq are needed to obtain PCR products similar to those
generated with Taq DNA pol I.

Similarly, Lawyer et al. (1993) reported that T. aquaticus DNA
polymerase I fragments possessed greater thermostability and were active over
a broader Mg++ range than the corresponding holoenzyme. Because of its
broader range of magnesium ion concentration, the Stoffel fragment has been
used in multiplex PCR, where more than two primers must anneal to the
template. The thermostability of the Stoffel fragment makes this enzyme a
better choice when GC-rich templates are amplified. It is desirable to purify
and isolate additional DNA polymerase enzymes and derivatives, to take
advantage of the unique but unpredictable properties that such molecules may
have.

There remains a need in the art for new, thermostable DNA
polymerase enzymes for use in the expanding universe of molecular biological
applications. More particularly, there exists a need for thermostable DNA
polymerase enzymes having high purity, high DNA polymerase specific
activity, low levels of exonuclease activity, and possessing high fidelity (low
mutation frequencies) and high processivity when used in DNA amplification
protocols.

An object of the present invention is to provide polymerase
enzyme preparations of greater purity, quantity, DNA polymerase specific
activity, and processivity than have heretofore been possible. A further object
is to eliminate the need for, and expense of, culturing large volumes of
thermophilic bacteria at high temperatures that is associated with preparing
thermostable polymerase enzyme preparations. Yet another object is to
provide a recombinant polymerase possessing reduced exonuclease activities,
as compared to the currently available native holoenzyme.

SUMMARY OF THE INVENTION 

The present invention relates to the cloning and expression of
a gene encoding a thermostable DNA polymerase, the purification of a
recombinant thermostable DNA polymerase encoded by the gene, and
applications for using the polymerase. The gene of the Thermus flavus DNA
polymerase I (Tfl DNA pol I) was cloned and expressed in Escherichia coli.
The purified recombinant T. flavus DNA polymerase enzyme is shown to be
thermostable and to have a molecular weight of about 90,000 to 100,000
daltons. The DNA sequence of the Tfl DNA pol I gene, including flanking
sequences, was determined and the coding sequence of the recombinant enzyme
was mapped within this gene. A Tfl DNA pol I gene fragment also was expressed
in E. coli, the purified recombinant protein products ("exo- fragment") lacking
274 and 275 amino acids from the N-terminus of the Tfl DNA pol I
holoenzyme. This Tfl exo- fragment has very low 3' → 5' and 5' → 3'
exonuclease activities. Numerous properties of and applications for the
recombinant enzymes are described.

In one aspect, this invention provides purified polynucleotides
(e.g., DNA sequences and RNA transcripts thereof) encoding a thermostable
polypeptide having DNA polymerase activity. Preferred DNAs include the
Thermus flavus DNA pol I gene comprising nucleotides 301 to 2802 of SEQ
ID NO: 1; the Thermus flavus DNA pol I exo- fragment gene comprising
nucleotides 1 to 1791 of SEQ ID NO: 3; the DNA comprising nucleotides
112-1791 of SEQ ID NO: 3; a portion of the insert of plasmid pTFLRT4
(ATCC Accession No. 69633), said portion encoding a thermostable
polypeptide having DNA polymerase activity; a portion of the insert of
plasmid p21EHcM1.1 (ATCC Accession No. 69632), said portion encoding
a thermostable polypeptide having DNA polymerase activity; fragments or
portions of these DNAs that encode thermostable polypeptides having DNA
polymerase activity; and variants of these DNAs that encode thermostable
polypeptides having DNA polymerase activity.

In another aspect, this invention provides DNA sequences such
as those described above operatively linked to a promoter sequence, a cloning
vector, an expression vector, or combinations thereof.

In related aspects, the invention provides novel plasmids and
vectors. For example, the invention provides a plasmid pTFLRT4, having
ATCC Accession No. 69633; and a plasmid p21EHcM1.1, having ATCC
Accession No. 69632. The invention also provides a vector that includes
nucleotides 301 to 2802 of SEQ ID NO: 1, the nucleotides encoding a
polypeptide having thermostable DNA polymerase activity; and a vector that
includes nucleotides 112 to 1791 of SEQ ID NO: 3, the nucleotides encoding
a polypeptide having thermostable DNA polymerase activity.
In related aspects, the invention provides a vector having at
least one insert consisting essentially of nucleotides 301 to 2802 of SEQ ID
NO: 1, the nucleotides encoding a polypeptide having thermostable DNA
polymerase activity. The invention further provides a vector having at least
one insert consisting essentially of nucleotides 112 to 1791 of SEQ ID NO:
3, the nucleotides encoding a polypeptide having thermostable DNA
polymerase activity.

The present invention is also directed to host cells, such as
prokaryotic and eukaryotic cells, that have been stably transformed with
DNAs, vectors, or plasmids of the invention. Another aspect of the invention
is directed to such transformed host cells that are capable of expressing a
thermostable polypeptide encoded by the DNAs, the peptide having DNA
polymerase activity.

In another aspect, this invention provides purified thermostable
polypeptides having DNA polymerase activity. Preferred peptides include a
Thermus flavus DNA polymerase I holoenzyme substantially free of other
Thermus flavus proteins; a polypeptide having the amino acid sequence of
SEQ ID NO: 2; a fragment of a Thermus flavus DNA polymerase I
holoenzyme, including a fragment with reduced exonuclease activity as
compared to the holoenzyme, and also including a fragment having the amino
acid residues 1-560 or 2-560 of the amino acid sequence shown in SEQ ID
NO: 5; a fragment encoded by the insert of plasmid p21EHcM1.1, having
ATCC Accession No. 69632; fragments of the above peptides that retain DNA
polymerase activity; and variants of the above peptides that retain DNA
polymerase activity.

In another aspect, this invention provides methods for purifying
a thermostable polypeptide having DNA polymerase activity including the
steps of transforming a host cell with a DNA of the present invention to create
a transformed host cell; cultivating the transformed host cell under conditions
that promote expression of a thermostable polypeptide encoded by the DNA,
the polypeptide having DNA polymerase activity; and purifying the
thermostable polypeptide with a monoclonal antibody that is cross-reactive
with the thermostable polypeptide. In one preferred method, the
cross-reactive monoclonal antibody has specificity for a Thermus aquaticus DNA
polymerase and/or for a Thermus flavus DNA polymerase.

In another preferred method, commercially available
chromatography columns are used to purify the expressed polypeptide.

In another aspect, this invention provides methods of purifying
a thermostable polypeptide having DNA polymerase activity. One such
method includes the steps of expressing the thermostable polypeptide in a host
cell, the polypeptide having an amino acid sequence encoded by a DNA of the
present invention; lysing the cell to create a suspension containing the
thermostable polypeptide, as well as host cell proteins and cell debris;
contacting a soluble portion of the suspension with an antibody that is
immunologically cross-reactive with the thermostable polypeptide under
conditions wherein the antibody binds to the thermostable polypeptide to form
an antibody-polypeptide complex; isolating the antibody-polypeptide complex;
and separating the thermostable polypeptide from the isolated
antibody-polypeptide complex to provide a purified thermostable polypeptide.
Preferably, such a method further includes the steps of heating the suspension
to denature the host cell proteins; and centrifuging the suspension to remove
the cell debris and denatured host cell proteins. In more preferred methods,
the immunologically cross-reactive antibody is a monoclonal antibody, such
as a monoclonal antibody that is immunologically cross-reactive with Thermus
aquaticus DNA polymerase I and/or Thermus flavus DNA polymerase I. This
preferred method is exemplified herein using the monoclonal antibody purified
from a hybridoma designated hybridoma 7B12.

In another aspect, this invention provides methods of using the
DNA constructs of the invention to produce recombinant thermostable
polypeptides having DNA polymerase activity. One such method involves
using a DNA encoding a DNA polymerase enzyme to generate active
fragments of the DNA polymerase enzyme, including the steps of: deleting a
portion of the DNA to create a modified DNA; expressing the modified DNA
to produce a DNA polymerase enzyme fragment; purifying the DNA
polymerase enzyme fragment; assaying the DNA polymerase enzyme fragment
for DNA polymerase activity; and selecting a DNA polymerase enzyme
fragment having DNA polymerase activity; wherein the DNA is selected from
among the DNAs described herein.

In another aspect, this invention provides methods for using the
proteins of the invention in biological applications, such as DNA sequencing;
amplification of DNA and/or RNA sequences; polymerase chain reaction
(PCR); thermal cycle labeling (TCL); universal thermal cycle labeling
(UTCL); ligase chain reaction (LCR); and other applications or processes that
would be apparent to those skilled in the art.

In yet another aspect, this invention provides kits for using the
proteins of the invention in various biological applications, such as kits for
labeling DNA.

BRIEF DESCRIPTION OF THE DRAWINGS
FIGURES 1A and 1B graphically depict the cloning strategy:
(1A) for the gene encoding the Tfl DNA pol I holoenzyme; and (1B) for the
DNA encoding the exo- fragment of T. flavus DNA polymerase I. The
abbreviations used are: B: BamHI, RI: EcoRI, RV: EcoRV, Hc: HincII, P
lacZ: promoter of the lacZ gene, S: SalI, and X: XbaI. Jagged lines
represent vector DNA; straight horizontal lines represent Tfl
insert DNA; dark and light shaded rectangles depict Tfl DNA pol I gene
sequences. The graphical depictions are not drawn to scale, and not all
available restriction sites are shown in all steps.

FIGURE 2 depicts the DNA sequence and the deduced amino
acid sequence for the Tfl DNA pol I holoenzyme coding sequence and for 5'
untranslated and 3' untranslated sequences. The circled amino acid (Glu239)
is the first amino acid believed to be translated during translation of plasmid
p21EHcM1.1, encoding the Tfl exo- fragment. The boxed amino acid (Leu^j)
is the amino acid determined to be the first amino acid of the purified and
isolated major Tfl exo- fragment. An asterisk (*) indicates the stop codon
TAG.

FIGURE 3 is a comparison of deduced amino acid sequences
from the Thermus flavus DNA polymerase I of this invention (MBR TFL);
Thermus aquaticus DNA polymerase I (TAQ) reported in Lawyer et al., J.
Biol. Chem. 264:6427-6437 (1989); and purported Thermus flavus DNA
polymerase I (A&V TFL) described in Akhmetzjanov and Vakhitov, Nucleic
Acids Res. 20:5839 (1992). The sequences were aligned to maximize
homology. Conservative differences between the amino acid sequences are
indicated with asterisks (*) and non-conservative differences are indicated with
arrowheads (^).

FIGURE 4 depicts double-stranded DNA sequence of the T.
flavus DNA pol I gene, including 5' untranslated and 3' untranslated
sequences. Lower case letters indicate untranslated sequences, upper case
letters represent the coding sequence. The start codon (ATG) of Tfl DNA pol
I is at positions 301-303, and the stop codon is at positions 2803-2805. The
positions in the sequence that correspond to synthetic primers used in
sequencing reactions have been indicated with boxes. The sequence of the 2-4
fragment is underlined with an arrow (<-->).

FIGURE 5 depicts the relative DNA polymerase enzymatic
activity, at different buffered pH levels, of native Thermus flavus holoenzyme
(nTfl Holo: empty squares); recombinant Thermus flavus holoenzyme (rTfl
Holo: diamonds); Thermus flavus exo- fragment (Tfl exo-: circles); T.
aquaticus DNA pol I (AmpliTaq: crossed boxes); and the Taq enzyme Stoffel
fragment (Stoffel: triangles).

FIGURE 6A depicts the relative DNA polymerase enzymatic
activity, at different concentrations of MgCl2, of native Thermus flavus
holoenzyme (nTfl Holo: empty squares); recombinant Thermus flavus
holoenzyme (rTfl Holo: diamonds); Thermus flavus exo- fragment (Tfl exo-:
circles); and T. aquaticus DNA pol I Stoffel fragment (Stoffel: triangles).

FIGURE 6B depicts the relative DNA polymerase enzymatic
activity, at different concentrations of MnCl2, of native Thermus flavus
holoenzyme (nTfl Holo: empty squares); recombinant Thermus flavus
holoenzyme (rTfl Holo: diamonds); Thermus flavus exo- fragment (Tfl exo-:
circles); and T. aquaticus DNA pol I Stoffel fragment (Stoffel: triangles).

FIGURE 7A depicts the relative DNA polymerase enzymatic
activity, at different temperatures, of native Thermus flavus holoenzyme (nTfl
Holo: open boxes); recombinant Thermus flavus holoenzyme (rTfl Holo:
diamonds); Thermus flavus exo- fragment (Tfl exo-: circles); and T. aquaticus
DNA pol I Stoffel fragment (Stoffel: triangles).

FIGURE 7B photographically depicts the relative quantities of
PCR amplification product generated after 25, 30, and 35 reaction cycles,
using 10 units of Tfl exo- fragment (E) or Stoffel fragment (S) as the PCR
DNA polymerase. The far right lane depicts the PCR amplification product
generated after 35 reaction cycles using 1.1 unit of Tfl exo- fragment.

FIGURE 8 depicts enzymatic stability in thermal cycling
(relative DNA polymerase enzymatic activity after different numbers of PCR
cycles), of native Thermus flavus holoenzyme (nTfl Holo: empty squares);
Thermus flavus exo- fragment (Tfl exo-: circles); T. aquaticus DNA pol I
(AmpliTaq: crossed squares); and T. aquaticus DNA pol I Stoffel fragment
(Stoffel: triangles).

FIGURE 9 photographically depicts the purity of purified E.
coli DNA polymerase I (Eco Pol I, control), recombinant Thermus flavus
holoenzyme (rTfl Holo), and Thermus flavus exo- fragment (Tfl exo-) on a
12.5% SDS-PAGE gel stained with silver.

FIGURES 10A, 10B, 10C, and 10D photographically depict
portions of autoradiographs of sequencing gels showing DNA sequence obtained
with the indicated polymerases substituted into the SEQUAL™ or the Cycle
SEQUAL™ DNA Sequencing Kit. Abbreviations: recombinant Thermus flavus
holoenzyme (Tfl Holo); Thermus flavus exo- fragment (Tfl exo-); T. aquaticus
DNA pol I holoenzyme (AmpliTaq); and the Taq enzyme Stoffel fragment
(Taq Stoffel).



DETAILED DESCRIPTION OF THE INVENTION

This application describes the isolation and characterization of
the gene coding for Thermus flavus (ATCC Accession No. 33923) DNA
polymerase I (Tfl DNA pol I) and having homology to the family A enzymes
described above. Also described is the expression of this gene in E. coli and
the purification and characterization of the recombinant DNA polymerase.
The cloning and expression of an active fragment of the Thermus flavus DNA
polymerase gene is also described, and the gene fragment and expressed
peptides are characterized. Recombinant vectors and host cells are also
described. Additionally, methods and kits are described that involve the
DNAs and proteins of the present invention. Thus, as the discussion below
details, the present invention has several aspects.

As a first step in the generation of the DNAs and polypeptides
of the present invention, native T. flavus DNA polymerase I was purified and
isolated from T. flavus cells (ATCC Accession No. 33923) and digested with
trypsin, and amino acid sequence information was obtained from one of the
reaction products (i.e., from a trypsin digest protein fragment). (See Example
1.) Additionally, a Thermus flavus genomic library was constructed in phage
λ Dash II and amplified. (See Example 2.)

The amino acid sequence information generated in Example 1,
together with published amino acid sequence information from the Thermus
aquaticus DNA pol I gene, was used to create synthetic DNA primers for
isolating a portion of the Thermus flavus DNA polymerase I gene. (Example
3.) More particularly, a first primer, designated FTFL2, was synthesized to
correspond with known coding sequence from the T. aquaticus DNA pol I gene
(Lawyer et al., J. Biol. Chem. 264:6427-6437 (1989)), and to bind to the top
strand of the T. aquaticus DNA pol I gene. The particular T. aquaticus
coding sequence chosen encodes a portion of the T. aquaticus DNA pol I
amino acid sequence that is homologous to the native T. flavus DNA pol I
peptide that had previously been sequenced (Example 1). A second primer,
designated RTFL4, was synthesized to have a sequence that binds to the
3'-end of the T. aquaticus gene on the opposite strand. A DNA amplification
reaction was performed with primer FTFL2, primer RTFL4, and T. flavus
genomic DNA. The amplification reaction yielded a single amplification
product, designated the "2-4 fragment." This fragment was cloned into
M13mp18 vector, amplified in E. coli, and sequenced.

As explained in detail in Example 4, the 2-4 fragment (obtained
by the procedures outlined in Example 3) was used to isolate the Thermus
flavus DNA pol I gene from the T. flavus genomic library that had been
constructed (Example 2). Specifically, the 2-4 fragment was further amplified
and used to generate probes via thermal cycle labeling (TCL). The amplified
T. flavus genomic library was plated on 2XTY plates and grown until plaques
formed. Duplicate plaque lifts were obtained from each plate onto Hybond
N filters, and these filters were then screened with the above-described TCL
probes using hybridization methods well known in the art. Positive plaques
were selected, purified by dilution and by re-screening with the 2-4 probes,
and then further characterized. In particular, two clones with inserts of 14-16
kb, designated λ21 and λ51, were chosen for further analyses.

Clones λ21 and λ51 were used as a starting point from which
the complete T. flavus DNA pol I gene was cloned and sequenced. As
explained in detail in Example 5 and with reference to FIGURE 1A,
restriction mapping, subcloning, and partial sequencing led to the
determination that a subclone of λ21 designated p21E10 contained about 2/3
of the Tfl DNA pol I gene (3' end), whereas a subclone from λ51 designated
p51E9 contained a 5' portion of the gene that overlapped the coding sequence
contained in clone p21E10.

A primer walking procedure was used to obtain the complete
sequence of the gene. Specifically, primers homologous or complementary to
the ends of previously determined sequences (obtained from p21E10 and from
other deletion vectors) were synthesized and used in additional sequencing
reactions. By repeating this process the entire length of the gene was
sequentially sequenced. The DNA and deduced amino acid sequence for the
T. flavus DNA pol I holoenzyme are shown in FIGURE 2, which corresponds
to SEQ ID NOS: 1 and 2 in the Sequence Listing. The sequences of each
primer used, and the relative location of the primers in the gene sequence, are
depicted in Table 2 and in FIGURE 4, respectively. The amino acid sequence
of the holoenzyme depicted in FIGURE 2 and SEQ ID NO: 2 corresponds
with nucleotides 301 to 2802 of the DNA depicted in FIGURE 2 and SEQ ID
NO: 1.
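
The primer walking strategy described above can be illustrated with a
minimal sketch. The template string, read length, and primer length used
here are hypothetical values chosen only for illustration; they are not taken
from the sequencing reactions of Example 5, and the sketch walks along a
single strand and ignores sequencing errors.

```python
def primer_walk(template, read_len=12, primer_len=4):
    """Simulate primer walking along one strand of `template`.

    Each round takes the last `primer_len` bases of the sequence known so
    far as the next primer, "reads" up to `read_len` bases starting at that
    primer, and appends the newly read bases to the known sequence.
    Returns (known_sequence, primers_used).
    """
    if primer_len >= read_len:
        raise ValueError("read_len must exceed primer_len to make progress")
    known = template[:read_len]           # first read, e.g. from a vector primer
    primers = []
    while len(known) < len(template):
        primer = known[-primer_len:]      # primer matching the known 3' end
        primers.append(primer)
        start = len(known) - primer_len   # position where the primer anneals
        read = template[start:start + read_len]
        known += read[primer_len:]        # keep only the bases beyond the primer
    return known, primers


# Hypothetical 27-base template, for illustration only.
toy_template = "ATGGCGATGCTTCCCCTCTTTGAGCCC"
sequence, primers = primer_walk(toy_template)
print(sequence == toy_template, primers)
```

Each round extends the known sequence by the read length minus the primer
length, which is why repeating the process eventually covers the whole gene.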

The foregoing results demonstrate that an aspect of the
invention is directed to a purified DNA encoding a thermostable polypeptide
having DNA polymerase activity, the DNA comprising nucleotides 301 to
2802 of SEQ ID NO: 1. This DNA may be operatively linked to other
DNAs, such as expression vectors known in the art. The invention is also
directed to a vector having at least one insert consisting essentially of
nucleotides 301 to 2802 of SEQ ID NO: 1, the nucleotides encoding a
thermostable polypeptide having DNA polymerase activity. Similarly, the
invention is directed to a vector comprising nucleotides 301 to 2802 of SEQ
ID NO: 1, the nucleotides encoding a polypeptide having thermostable DNA
polymerase activity.
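
The nucleotide coordinates recited above imply simple size relationships
that can be checked arithmetically. The sketch below assumes 1-based,
inclusive coordinates and treats every codon in a range as encoding one
residue. The average residue mass of 110 daltons is a common rule of thumb
and is an assumption, not a value taken from this application.

```python
AVG_RESIDUE_MASS_DA = 110  # assumed rule-of-thumb average, not from the application


def codon_count(first_nt, last_nt):
    """Number of complete codons in a 1-based, inclusive nucleotide range."""
    length = last_nt - first_nt + 1
    if length % 3 != 0:
        raise ValueError("range is not a whole number of codons")
    return length // 3


holo_residues = codon_count(301, 2802)  # holoenzyme coding range, SEQ ID NO: 1
exo_residues = codon_count(112, 1791)   # exo- fragment range, SEQ ID NO: 3

print(holo_residues)                        # 834
print(exo_residues)                         # 560
print(holo_residues - exo_residues)         # 274
print(holo_residues * AVG_RESIDUE_MASS_DA)  # 91740
```

Under these assumptions the estimate of roughly 92,000 daltons falls within
the 90,000 to 100,000 dalton range stated for the recombinant holoenzyme,
and the 274-residue difference agrees with the N-terminal deletion described
for the exo- fragment.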

With the gene sequence established, the DNA and deduced
amino acid sequences of the T. flavus DNA pol I gene were aligned and
compared to the DNA and deduced amino acid sequences of the purported Tfl
DNA pol I published by Akhmetzjanov and Vakhitov, Nucleic Acids Res.
20:5839 (1992) (83% DNA sequence homology, 85% amino acid sequence
homology) and to the deduced amino acid sequence of the Taq pol I gene
(86% DNA sequence homology, 87% amino acid sequence homology). The
amino acid comparison is depicted in FIGURE 3.
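
Homology percentages of this kind are typically derived from pairwise
alignments. As a minimal sketch, the function below computes percent
identity over two pre-aligned sequences of equal length, skipping columns in
which either sequence has a gap. The application does not state how its
percentages were calculated, so this is only one plausible convention, and
the toy sequences are hypothetical.

```python
def percent_identity(aligned_a, aligned_b, gap="-"):
    """Percent of gap-free alignment columns in which the residues match."""
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    compared = 0
    matches = 0
    for a, b in zip(aligned_a, aligned_b):
        if a == gap or b == gap:
            continue              # ignore columns containing a gap
        compared += 1
        if a == b:
            matches += 1
    if compared == 0:
        raise ValueError("no gap-free columns to compare")
    return 100.0 * matches / compared


# Hypothetical aligned peptides, for illustration only.
print(percent_identity("MLPLFEPKGR", "MLPLFEPRG-"))
```

Other conventions, such as counting gapped columns in the denominator, give
different values for the same alignment, so the convention should be stated
whenever such figures are compared.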

To produce a recombinant T. flavus DNA pol I protein, a
full-length T. flavus DNA pol I gene clone was constructed, expressed in E. coli,
and purified. As detailed in Example 6 and FIGURE 1A, plasmids p51E9 and
p21E10 were further restriction mapped and subsequently subcloned to
generate plasmid p21BRV2, containing a 1.3 kb insert that includes the 3'
region of the Tfl DNA pol I gene, and plasmid p51X16, containing a 2.5 kb
BamHI fragment in which the 5' region of the gene was located.
Linearization of plasmid p21BRV2 with BamHI and ligation of this linearized
plasmid to the BamHI fragment of p51X16 yielded clone pTFL1.4,
containing the entire Tfl DNA pol I gene.

E. coli DH5αF' were transformed with plasmid pTFL 1.4 and
grown in a fermentor to recombinantly produce T. flavus DNA pol I
holoenzyme. As detailed in Example 6, this recombinant protein was purified
from the lysed E. coli with a method that included a heat denaturation of E.
coli proteins, precipitations and centrifugations, Sephadex G-25 and Bio-Rex
70 column chromatography, and immunoaffinity chromatography. The
calculated DNA polymerase specific activity of T. flavus DNA pol I isolated
by this procedure was determined to be 79,500 U/mg protein.

In order to increase the yield of recombinant T. flavus DNA pol
I holoenzyme, a second expression clone was constructed in which the lacZ
promoter was fused directly to the initiation codon of the Tfl DNA pol I gene.
As detailed in Example 7 and FIGURE 1A, the promoter was fused to the 5'
portion of the gene using site-directed mutagenesis, and a second
generation expression clone, designated pTFLRT4, was generated.

E. coli (strain DH5αF'IQ) were transformed with pTFLRT4
and cultivated, and recombinant T. flavus DNA pol I was isolated therefrom
and purified. As detailed in Example 7, the purification protocol includes heat
treatment, polyethyleneimine (PEI) precipitation, (NH4)2SO4 precipitation,
Bio-Rex 70 chromatography and immunoaffinity chromatography. The yield
was approximately 2,000,000 units of enzyme from 500 g of cells, and the
purified enzyme preparation was found to have a DNA polymerase specific
activity of 217,600 U/mg protein. The N-terminal amino acid sequence of the
recombinant Tfl DNA pol I enzyme was determined and found to be identical
to the sequence deduced from the T. flavus DNA pol I gene sequence.

The foregoing discussion demonstrates that an aspect of the
invention is directed to a purified DNA comprising a portion of the insert of
plasmid pTFLRT4, the plasmid having ATCC Accession No. 69633, the
portion encoding a thermostable polypeptide having DNA polymerase activity.
This DNA may be operatively linked to additional DNAs, such as promoter
DNAs and/or expression vector DNAs known in the art. A preferred DNA
is plasmid pTFLRT4 itself. The present invention is also directed to
thermostable polypeptides having DNA polymerase activity. In one aspect,
the invention is directed to a Thermus flavus DNA polymerase protein
substantially free of other Thermus flavus proteins. Exemplary proteins
include a DNA polymerase protein having the amino acid sequence of SEQ
ID NO: 2. Similarly, the invention is directed to a thermostable polypeptide
having DNA polymerase activity and consisting essentially of the amino acid
sequence of SEQ ID NO: 2.

In addition to the cloning and expression of the Tfl DNA pol
I holoenzyme, a vector allowing for the expression of a truncated DNA
polymerase was generated. As explained in Example 8 and FIGURE 1B, a
vector lacking the 5' one-third of the T. flavus DNA polymerase I gene was
constructed. Specifically, the ATG start codon of lacZ was brought in frame
with the DNA encoding amino acids 239 to 834 of the Tfl DNA pol I
holoenzyme using site-directed mutagenesis, and the resulting plasmid,
designated p21EHcM1.1, was expressed in E. coli DH5αF'. The insert of
plasmid p21EHcM1.1 includes a DNA sequence that corresponds with SEQ
ID NO: 3 in the Sequence Listing, and encodes a polypeptide predicted to
have the amino acid sequence depicted in SEQ ID NO: 4. The expressed
polypeptide product was designated Thermus flavus DNA polymerase I
exonuclease-free fragment, or "Tfl exo⁻ fragment." An aspect of the invention
is directed to a purified DNA comprising a portion of the insert of plasmid
p21EHcM1.1, the plasmid having ATCC Accession No. 69632, the portion
encoding a thermostable polypeptide having DNA polymerase activity. This
DNA may be operatively linked to additional DNAs, such as known promoter
DNAs and/or expression vectors. A preferred DNA is plasmid p21EHcM1.1
itself.

As detailed in Example 8, the purification protocol for the Tfl
exo⁻ fragment expressed in E. coli [p21EHcM1.1] included PEI-precipitation,
gel filtration, Procion-Red Sepharose chromatography and immunoaffinity
chromatography. The yield using this preparation protocol was approximately
300,000 units of enzyme from 50 g of cells, and the preparation had a DNA
polymerase specific activity of 600,000 U/mg protein.
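
The yield and specific activity figures reported for the two recombinant
preparations can be related by simple arithmetic. The following sketch is
illustrative only; the function name summarize is hypothetical, and the
protein mass it reports is merely the quotient of the stated total units and
the stated specific activity, not a value measured in the Examples.

```python
def summarize(total_units: float, cell_mass_g: float,
              specific_activity_u_per_mg: float) -> tuple[float, float]:
    """Return (units per gram of cells, implied mg of purified protein)."""
    units_per_g = total_units / cell_mass_g
    protein_mg = total_units / specific_activity_u_per_mg
    return units_per_g, protein_mg


# Figures as stated in the text for the two recombinant preparations.
preps = {
    "Tfl holoenzyme (Example 7)": (2_000_000, 500, 217_600),
    "Tfl exo fragment (Example 8)": (300_000, 50, 600_000),
}

for name, args in preps.items():
    u_per_g, mg = summarize(*args)
    print(f"{name}: {u_per_g:.0f} U/g cells, about {mg:.2f} mg protein")
```

On these figures the holoenzyme preparation corresponds to about 4,000 units
per gram of cells and the exo fragment preparation to about 6,000 units per
gram of cells.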

The N-terminal amino acid sequence of the Tfl exo⁻ fragment
was determined (Example 8), and interestingly, the purified protein lacked 37
N-terminal amino acids predicted from the DNA encoding the exo⁻ fragment.
The deduced amino acid sequence of the purified Tfl exo⁻ fragment, based
on this amino acid sequence data and the complete DNA sequence, is
depicted in SEQ ID NO: 5, and corresponds with amino acids 275 to 834 of
FIGURE 2. A minor sequence lacking 38 N-terminal amino acids was also
detected.

The foregoing demonstrates that another aspect of the invention
is directed to a purified DNA encoding a thermostable polypeptide having
DNA polymerase activity, the DNA comprising a portion of SEQ ID NO: 3.
For example, the invention is directed to a purified DNA comprising
nucleotides 112 to 1791 of SEQ ID NO: 3. This DNA also may be
operatively linked to other DNAs, such as to nucleotides 1 to 111 of SEQ ID
NO: 3, and/or to expression vectors known in the art. In a related aspect, the
invention is directed to a vector comprising nucleotides 112 to 1791 of SEQ
ID NO: 3, the nucleotides encoding a polypeptide having thermostable DNA
polymerase activity. Similarly, the invention is directed to a vector having at
least one insert consisting essentially of nucleotides 112 to 1791 of SEQ ID
NO: 3, the nucleotides encoding a thermostable polypeptide having DNA
polymerase activity.
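
The nucleotide and amino acid coordinates given for the exo fragment are
mutually consistent, as the short check below illustrates. It is a sketch
only: the helper name codons is hypothetical, and coordinates are assumed to
be 1-based and inclusive with no stop codon inside the stated ranges.

```python
def codons(start: int, end: int) -> int:
    """Whole codons in a 1-based, inclusive nucleotide range."""
    length = end - start + 1
    if length % 3 != 0:
        raise ValueError("range is not a whole number of codons")
    return length // 3


removed = codons(1, 111)        # nucleotides 1 to 111 of SEQ ID NO: 3
retained = codons(112, 1791)    # nucleotides 112 to 1791 of SEQ ID NO: 3
holoenzyme_span = 834 - 275 + 1  # amino acids 275 to 834 of FIGURE 2

print(removed, retained, holoenzyme_span)  # 37 560 560
```

The 37 codons of nucleotides 1 to 111 match the 37 N-terminal amino acids
absent from the purified protein, and the 560 codons of nucleotides 112 to
1791 match both the 560 residues of SEQ ID NO: 5 and the span from amino
acid 275 to amino acid 834 of the holoenzyme.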

The recombinant expression and purification of biologically
active Tfl exo⁻ fragment demonstrates additional aspects of the present
invention. For example, the present invention is directed to a purified
fragment of Thermus flavus DNA polymerase I protein, the fragment having
DNA polymerase activity. Exemplary fragments include a fragment having
an amino acid sequence comprising amino acids 2 to 560 or 1 to 560 of
SEQ ID NO: 5, and a fragment encoded by the insert of plasmid
p21EHcM1.1, having ATCC Accession No. 69632. Also, the invention is
directed to a polypeptide having DNA polymerase activity and consisting
essentially of the amino acid sequence of SEQ ID NO: 5.

The foregoing description of methods and recombinant cells
demonstrates that the present invention is directed to more than DNAs and
polypeptides. Another important aspect of the invention is directed to a host
cell transformed with a DNA, vector, or plasmid of the present invention,
including those specifically mentioned above. Preferably, the host cell
transformed with a DNA is capable of expressing a thermostable polypeptide
encoded by the DNA, the polypeptide having DNA polymerase activity. By
host cell is meant both prokaryotic host cells, including E. coli cells, and
eukaryotic host cells.

In addition to being directed to DNAs, transformed cells, and
polypeptides, the present invention is directed to various methods for using
DNAs and polypeptides. For example, the invention is directed to a method
for purifying a thermostable polypeptide having DNA polymerase activity
comprising the steps of: transforming a host cell with a DNA to create a
transformed host cell, the DNA selected from the DNAs of the present
invention; cultivating the transformed host cell under conditions to promote
expression of a thermostable polypeptide encoded by the DNA, the
polypeptide having DNA polymerase activity; and purifying the thermostable
polypeptide with a monoclonal antibody that is cross-reactive with the
thermostable polypeptide. In one preferred method, the cross-reactive
monoclonal antibody has specificity for a Thermus aquaticus DNA polymerase
and/or for a Thermus flavus DNA polymerase.

In another preferred method, commercially available 
chromatography columns are used to purify the expressed polypeptide. 

The purification protocols for recombinant Tfl DNA polymerase
I and Tfl exo⁻ fragment demonstrate that another aspect of the invention relates
to methods of purifying a thermostable polypeptide having DNA polymerase
activity. One such method includes the steps of expressing the thermostable
polypeptide in a host cell, the polypeptide having an amino acid sequence
encoded by a DNA of the present invention; lysing the cell to create a
suspension containing the thermostable polypeptide and host cell proteins and
cell debris; contacting a soluble portion of the suspension with an antibody
that is immunologically cross-reactive with the thermostable polypeptide under
conditions wherein the antibody binds to the thermostable polypeptide to form
an antibody-polypeptide complex; isolating the antibody-polypeptide complex;
and separating the thermostable polypeptide from the isolated
antibody-polypeptide complex to provide a purified thermostable polypeptide.
Preferably, such a method further includes the steps of heating the suspension
to denature the host cell proteins; and centrifuging the suspension to remove
the cell debris and denatured host cell proteins. In more preferred methods,
the immunologically cross-reactive antibody is a monoclonal antibody, such
as a monoclonal antibody that is specific for Thermus aquaticus DNA
polymerase I and/or Thermus flavus DNA polymerase I. This preferred
method is exemplified herein using a monoclonal antibody purified from a
hybridoma designated hybridoma 7B12. This monoclonal antibody is
commercially available from Molecular Biology Resources, Inc., Milwaukee,
Wisconsin, as Cat. No. 4100-01.

The invention is also directed toward a method of using a DNA
encoding a DNA polymerase enzyme to generate active fragments of the DNA
polymerase enzyme, comprising the steps of: deleting a portion of the DNA
to create a modified DNA, expressing the modified DNA to produce a DNA
polymerase enzyme fragment, purifying the DNA polymerase enzyme
fragment, assaying the DNA polymerase enzyme fragment for DNA
polymerase activity, and selecting a DNA polymerase enzyme fragment having
DNA polymerase activity, wherein the DNA is selected from DNAs of the
present invention.

As detailed in Example 9 and summarized in Table 3A, a
number of experiments were conducted to characterize the exonuclease
activities of T. flavus DNA pol I holoenzyme and exo⁻ fragment. For both the
holoenzyme and the exo⁻ preparation, each exonuclease and endonuclease
activity assayed was either very low or undetectable.

As detailed in Example 10, a number of additional assays were
performed to better characterize the recombinant Tfl DNA pol I proteins that
had been purified and to compare these proteins to other known thermostable
DNA polymerases. For example, the DNA polymerase activity of the Tfl
holoenzyme and the exo⁻ fragment was analyzed at different pH values, and
at different MgCl2 and MnCl2 concentrations. FIGURES 5 (pH optima), 6A
(MgCl2 optima), 6B (MnCl2 optima) and 7A (temperature optima) summarize
the results of some of these assays. The optimal range and the peak values
(in parentheses) are summarized in Table 1A.



TABLE 1A

                 Holoenzyme         Exo⁻ Fragment
pH               9.5 - 10.5 (10)    7.5 - 10 (8.5)
MgCl2 [mM]       >50                1.3 - 13 (5)
MnCl2 [mM]       0.8 - 4 (2)        2.1 - 11 (4)

To assay thermostability, the enzymes were incubated for 30 min. at different
temperatures to define the temperature optimum. The highest activity (100%)
was found at 80°C for the holoenzyme (14% remaining after 30 min. at
90°C), and at 70 to 75°C for the exo⁻ fragment (8% remaining after 30 min.
at 90°C).

The Tfl holoenzyme preparation was more than 95%
pure as judged by sodium dodecyl sulfate polyacrylamide gel electrophoresis
(SDS-PAGE) on a 12.5% gel (FIGURE 9). The apparent molecular weight was
80 kD, which is lower than the calculated molecular weight of approximately
94 kD based on the DNA sequence. The Tfl holoenzyme preparation was
found to be free of detectable double-stranded nucleases and of 5'→3'
exonuclease and endonuclease activities. Low levels of single-stranded
nucleases and of 3'→5' exonuclease activity were found. The isoelectric
point was determined to be 6.43.

The purified Tfl exo⁻ fragment was found to possess low 3'→5'
and 5'→3' exonuclease activities. The preparation was more than 95%
pure as judged by SDS-PAGE (FIGURE 9). The apparent molecular weight
of 68 kD as judged by SDS-PAGE compares well with the calculated
molecular weight of approximately 63 kD. The Tfl exo⁻ preparation was
found to be free of detectable double- and single-stranded nucleases and
endonuclease activities. The isoelectric point was determined to be 5.94.

The performances of the Tfl holoenzyme and the exo⁻ fragment
were tested in DNA sequencing, PCR and TCL (Examples 11, 12 and 13).
Both enzymes were found to be useful in sequencing reactions utilizing labeled
primer in conjunction with single-stranded and double-stranded DNA
templates, and in a cycle sequencing reaction with a single-stranded template.
The enzymes were also useful in sequencing reactions utilizing internal
labeling with, for example, [α-35S]-dATP. In all the reactions tested the Tfl
exo⁻ fragment provided DNA sequence information of more than 150
nucleotides, as did recombinant Tfl DNA pol I holoenzyme.

The Tfl DNA pol I holoenzyme and the exo⁻ fragment were
tested in PCR reactions. The recombinant holoenzyme gave similar results to
the native enzyme. The Tfl exo⁻ fragment retained 50% of its activity after 16
cycles. The holoenzyme retained 50% of its activity for 20 cycles. The
specific amplified products were analyzed at the same time. After 20 cycles,
an amplification product was visible on agarose gels. The amount of product
increased between 25-50 cycles, but decreased after 100 cycles.

The native T. flavus enzyme provided with the ZEPTO™
Labeling kit (CHIMERx, Madison, WI) was replaced by the recombinant
holoenzyme or by the (recombinant) exo⁻ fragment. The efficiency of the
labeling of plasmid pUC19 was determined on agarose gels and the efficiency
of incorporation was determined in dot blot analysis. A dilution of 1:10 s of
labeled probes generated with the holoenzyme was detectable (1: 10 s for
probes generated by the exo⁻ fragment). Both results indicated that the
enzymes have the required activity needed for labeling pUC19 DNA in TCL.

A protocol is also provided demonstrating that the present
invention is also directed to TCL in which the recombinant Tfl DNA pol I
holoenzyme is employed without exogenous primers for enzymatic extension.
In this method, referred to as Universal Thermal Cycle Labeling (UTCL),
DNA of unknown sequence is combined intact with rTfl DNA pol I
holoenzyme, deoxyribonucleotide triphosphates, and the appropriate buffer.
The holoenzyme is then combined with intact template and subjected to
repeated cycles of denaturation, annealing and extension. α-32P-dATP,
32P-dTTP, 32P-dGTP, 32P-dCTP, biotin-dUTP, fluorescein-dUTP, or
digoxigenin-dUTP is also included in the extension step for subsequent
detection purposes.

The foregoing results demonstrate further aspects of the
invention. For example, the invention is further directed to a method for
labeling DNA, comprising the steps of: digesting an aliquot of template DNA
with a restriction endonuclease reagent wherein the digestion generates
sequence-specific DNA fragments; mixing an aliquot of undigested template
DNA with the sequence-specific DNA fragments; denaturing the mixture of
template DNA and sequence-specific DNA fragments thereby generating
denatured template DNA and oligonucleotide primers; annealing the primers
to the denatured undigested template DNA to form a DNA-primer complex;
and performing an extension reaction from the primers in the DNA-primer
complex using Tfl exo⁻ fragment in the presence of one or more nucleotide
triphosphates, wherein at least one nucleotide triphosphate has a label.

Further, the invention is directed to a method for thermal cycle
labeling DNA comprising the steps of: digesting an aliquot of template DNA
with a restriction endonuclease reagent wherein the digestion generates
sequence-specific DNA fragments; mixing an aliquot of undigested template
DNA with the sequence-specific DNA fragments; denaturing the mixture of
template DNA and the DNA fragments thereby generating denatured template
DNA and oligonucleotide primers; annealing the primers to the denatured
undigested template DNA to form a DNA-primer complex; performing an
extension reaction from the primers in the DNA-primer complex using Tfl
DNA pol I exo⁻ fragment in the presence of one or more nucleotide
triphosphates wherein at least one nucleotide triphosphate has a label;
heat-denaturing the labeled extension products; reannealing the excess primers
with the template DNA and with the extension products; and performing at least
one additional extension reaction from the DNA-primer complex using a Tfl
DNA pol I exo⁻ fragment.

The present invention is further directed to kits for labeling
DNA. A kit of the present invention includes, in association: a labeling
buffer; a concentrated mixture of 1 or more nucleotide triphosphates; Tfl
DNA pol I exo⁻ fragment; and a control DNA, the control DNA being useful
for monitoring the efficiency of labeling. Additionally, the kit may include
a restriction endonuclease reagent and a restriction endonuclease buffer.

In another aspect, a kit of the present invention for labeling
DNA comprises, in association: a Tfl DNA pol I exo⁻ fragment; and a Tfl
DNA pol I exo⁻ fragment buffer. Preferably, such a kit further comprises a
concentrated mixture of 1 or more nucleotide triphosphates and a control
DNA, the control DNA being useful for monitoring the efficiency of labeling.

The following examples are intended to describe various aspects
of the invention in greater detail. More particularly, in Example 1, the
purification and amino acid sequencing of native Thermus flavus DNA
polymerase I is described. In Example 2, the construction and amplification
of a Thermus flavus genomic DNA library is described. In Example 3, the
cloning and sequencing of a Thermus flavus DNA polymerase I gene fragment
is described. Example 4 details the preparation of gene-specific probes and
screening of the Thermus flavus genomic library for clones containing the T.
flavus DNA pol I gene. Example 5 details the sequencing of the T. flavus
DNA polymerase I gene. In Example 6, the construction and expression of
a full-length T. flavus DNA pol I clone and purification of full-length
recombinant T. flavus DNA pol I protein are described. In Example 7, the
construction and expression of a high-yield, full-length T. flavus DNA pol I
clone and purification of full-length recombinant T. flavus DNA pol I are
described. Example 8 details the cloning and expression of the exo⁻ fragment
of T. flavus DNA polymerase I. In Example 9, the characterization of
recombinant T. flavus DNA polymerase I exonuclease activities is detailed.
In Example 10, studies are described comparing the recombinant T. flavus and
T. aquaticus DNA polymerases. In Example 11, DNA sequencing with
recombinant T. flavus DNA polymerases is detailed. Example 12
demonstrates the utility of recombinant Tfl holoenzyme and the exo⁻ fragment
in polymerase chain reaction procedures. Example 13 demonstrates the utility
of recombinant Tfl DNA pol I holoenzyme and the Tfl exo⁻ fragment for use
in thermal cycle labeling procedures. Example 14 analyzes the utility of T.
flavus DNA pol I holoenzyme and exo⁻ fragment for reverse transcription
applications. Example 15 demonstrates the increased processivity of Tfl exo⁻
fragment as compared to native or recombinant Tfl DNA pol I holoenzyme or
Taq holoenzyme. Finally, Example 16 details a large "production scale"
purification of recombinant Tfl holoenzyme and exo⁻ fragment.

EXAMPLE 1

Purification and Amino Acid 
Sequencing of Native Tfl DNA Pol I 

Native T. flavus DNA polymerase I was isolated from T. flavus
cells and used to generate amino acid sequence information as described
below.

Thermus flavus obtained from the American Type Culture
Collection (ATCC 33923, Catalogue of Bacteria and Bacteriophages, 18th
Edition, 1992) was cultured as follows: one ampule of Thermus flavus ATCC
33923 was used to inoculate 100 ml culture medium (0.1 g nitrilotriacetic
acid, 3 g NZ Amine A, 3 g yeast extract, 5 g succinic acid [free acid], 0.001
g riboflavin, 0.522 g K2HPO4, 0.480 g MgSO4, 0.020 g NaCl, 2 ml Trace
Metal Solution (0.5 ml H2SO4, 2.2 g MnSO4, 0.5 g ZnSO4, 0.5 g H3BO3,
0.016 g CuSO4, 0.025 g Na2MoO4, 0.046 g cobalt nitrate) per liter, adjusted
to pH 8.0 with NaOH) and the culture was incubated overnight at 70°C with
shaking. In the morning 10 ml of the overnight culture was used to inoculate
1000 ml of medium. This culture was grown for about 8 hours at 70°C and
then used as an inoculum for 170 liters of medium in a New Brunswick 250
liter fermentor equipped with a ML 4100 controller. The settings for a typical
fermentation were 3 pounds back pressure, 60 liters/min. (lpm) aeration, 100
rpm agitation, at 70°C. The fermentation was terminated when the cells
reached a density of 2 - 3 O.D., as measured at 600 nm. The cells were
cooled down to room temperature and harvested by centrifugation at 17,000
rpm in a CEPA type 61 continuous flow centrifuge with a flow rate of 2 lpm.
The cell paste was stored at -70°C until used.

T. flavus cells (500-1500 g) were thawed in 3 volumes of lysis
buffer (20 mM Tris-HCl, pH 8.0, 0.5 mM ethylenediaminetetraacetate (EDTA),
7 mM β-mercaptoethanol (βME), 10 mM MgCl2) and homogenized.
Phenylmethylsulfonyl fluoride (PMSF), a protease inhibitor, was added to a
final concentration of 0.3 mM. The suspension was then treated with 0.2 g/l
of lysozyme (predissolved in lysis buffer) at 4°C for 1 hr. Cells were
homogenized twice at 9000 psi in a Manton Gaulin homogenizer, with the
suspension chilled to approximately 10°C between passes. New PMSF was
added to 0.2 g/l before, between and after passes. NaCl and
polyethyleneimine (PEI) (10% w/v, pH 7.0) were added to the crude,
homogenized lysate to a final concentration of 0.5 M and to 0.2%,
respectively. The sample was mixed well and centrifuged at 13,500 x g for
1 hour.

The supernatant from the centrifuged lysate was desalted by
diluting with 10 liters of DE52 column buffer (20 mM Tris-HCl, pH 8.0, 0.5
mM EDTA, 7 mM βME) and concentrated to approximately 4 liters using an
Amicon S10Y30 Spiral Ultrafiltration cartridge. The dilution/concentration
step was repeated two times, with a final concentrated volume of about 4
liters.

The desalted sample was batch contacted with 400 g of
equilibrated Whatman DE52 ion exchange resin (Maidstone, England). The
suspension was collected on a sintered glass funnel and washed 3 times with
1 volume of DE52 column buffer. The resin was then resuspended in a
minimal volume of buffer and poured into a column (4.5 x 50 cm), packed
and washed with an additional volume of buffer. The column was eluted with
a 0-0.5 M NaCl linear gradient (total gradient volume: 2000 ml).
Twenty-five ml fractions were collected at a rate of about 5 ml/min. Peak
fractions (fractions containing DNA polymerase activity) were determined by a
modified DNA polymerase assay described by Kaledin et al., Biokhimiya
45:644-651 (1980), pooled and dialyzed in approximately twenty-five volumes
of Affi-Gel Blue (AGB) column buffer (20 mM Tris-HCl, pH 7.5, 0.5 mM
EDTA, 10 mM βME, 10 mM MgCl2, 0.02% Brij 35).

The dialyzed DE52 peak fractions were applied to an AGB
column (4.4 x 40 cm, 600 ml packed volume, MBR Blue, Molecular Biology
Resources, Milwaukee, WI), which was washed with 2 column volumes of
AGB column buffer, and eluted with a 0-1.2 M NaCl linear gradient (total
gradient volume: 2000 ml). Twenty-five ml fractions were collected at a rate
of 1-5 ml/min. The peak fractions were dialyzed as above in AGB buffer.

The dialyzed AGB peak fractions were applied to a heparin
agarose column (4.4 x 16.5 cm, 250 ml packed volume (Affigel Heparin,
Bio-Rad, Hercules, CA; or Heparin Agarose, Molecular Chimerics, Madison,
WI)), which was washed with approximately 2 column volumes (until effluent
is no longer colored, and column resin is white in appearance), and eluted
with a 0.1-1.0 M NaCl linear gradient (total gradient volume: 1500 ml).
Twenty-five ml fractions were collected at a rate of 1-5 ml/min. The peak
fractions were dialyzed in HP Q Sepharose Column Buffer (20 mM Tris-HCl,
pH 7.5, 0.5 mM EDTA, 7 mM βME, 0.1% Brij 35).

The dialyzed heparin agarose peak fractions were filtered
through a 0.2 µm filter and applied at 4 ml/min. to the HP Q Sepharose
column (Pharmacia, Uppsala, Sweden) on FPLC. The column was washed
with several column volumes of buffer, and eluted with a 0-0.25 M NaCl
linear gradient. Ten ml fractions were collected at 4 ml/minute. The peak
fractions were dialyzed in HP S Column Buffer (20 mM Na-Citrate, pH 6.0,
1 mM EDTA, 7 mM βME, 0.1% Brij 35) or diluted in the same buffer,
depending on the volume of the fraction pool.

The dialyzed (or diluted) HP Q peak fractions were filtered
through a 0.2 µm filter and the HP S column (Pharmacia) was run as above,
washing with HP S Column buffer and eluting with a 0-0.25 M NaCl
gradient. Peak fractions were pooled and dialyzed against 4 liters of Final
Storage Buffer (50 mM Tris-HCl, pH 7.5, 0.1 mM EDTA, 5 mM DTT, 50%
glycerol). The final product was diluted to a concentration of 5000 U/ml in
the above buffer including 0.5% Tween 20 (Sigma Chemical Co., St. Louis,
MO) and 0.5% Nonidet P40 (Fluka Biochemika, Buchs, Switzerland) as
stabilizers and stored at -20°C. A typical preparation from 1200 g of cells
yields approx. 2,000,000 units (1,700 units/g) or about 40 mg of DNA
polymerase.

To quantify DNA polymerase activity, a DNA polymerase
activity assay was performed using a modification of a protocol described by
Kaledin et al., Biokhimiya 45:644-651 (1980). Reactions were performed in
a 50 µl reaction mixture of 25 mM Tris-HCl, pH 9.5 at 23°C; 50 mM KCl;
10 mM MgCl2; 1 mM DTT; 0.2 mM each dCTP, dGTP, dTTP, pH 7.0; 0.2
mM [α-32P]dATP, pH 7.0, 10 µCi/ml; 50 µg BSA; 15 ng activated DNA
(Baril et al., Nucleic Acids Res. 3:2641-2653 (1977)); and 5 µl of diluted
enzyme. For control purposes, enzymes with known activities (in general
AmpliTaq DNA polymerase (Perkin Elmer, Cat. No. N801-0060), or Taq DNA
polymerase purified according to a procedure described by Kaledin et al.,
Biokhimiya 45:644-651 (1980)) are diluted to 20, 40 and 80
units/ml. Two reactions were run without enzyme as negative controls for
background subtraction.

A 45 µl reaction mixture, less enzyme, was prepared and the
reaction was started by the addition of 5 µl of enzyme. After 10 min. of
incubation at 70°C, 40 µl was removed and added to 50 µl of yeast RNA
co-precipitant (10 mg/ml in 0.1 M sodium acetate, pH 5.0). One ml of 10%
trichloroacetic acid (TCA) was added and the samples were placed on ice for
at least 10 minutes. The mixture was filtered on a glass fiber filter disc and
washed first with 5% TCA/2% sodium pyrophosphate, and then with 95%
ethanol. The dried filter disc was counted in 5 ml of scintillation fluid.

One unit of activity is defined as the amount of enzyme
required to incorporate 10 nmol of total nucleotide into acid insoluble form in
30 min. at 70°C in this assay, the standard activity assay.
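
To illustrate how the unit definition relates to filter counts, the sketch
below converts a net count into units per ml. It is not the calculation used
in the Examples, which calibrate against enzymes of known activity. Every
numerical input in the usage line is hypothetical, as is the function name
units_per_ml, and the sketch assumes (i) that the specific radioactivity of
the dATP pool in cpm per nmol has been measured separately, (ii) that the
four nucleotides are incorporated in equal amounts, and (iii) that
incorporation is linear over time, so that a 10 min. reading can be scaled
to 30 min.

```python
def units_per_ml(sample_cpm: float, background_cpm: float,
                 cpm_per_nmol_datp: float, dilution_factor: float = 1.0,
                 reaction_ul: float = 50.0, counted_ul: float = 40.0,
                 enzyme_ul: float = 5.0, assay_min: float = 10.0) -> float:
    """Estimate DNA polymerase activity (U/ml) of the undiluted enzyme.

    One unit = 10 nmol total nucleotide made acid insoluble in 30 min.
    """
    net_cpm = sample_cpm - background_cpm
    # dATP incorporated in the whole reaction (only 40 of 50 ul is counted).
    datp_nmol = net_cpm / cpm_per_nmol_datp * (reaction_ul / counted_ul)
    # Assumption: dATP represents one quarter of all incorporated nucleotide.
    total_nmol = datp_nmol * 4.0
    # Assumption: linear kinetics, so scale the assay time to 30 min.
    total_nmol_30min = total_nmol * (30.0 / assay_min)
    units_in_reaction = total_nmol_30min / 10.0
    return units_in_reaction / (enzyme_ul / 1000.0) * dilution_factor


# Hypothetical counts: 20,200 cpm sample, 200 cpm background,
# 100,000 cpm per nmol dATP, enzyme assayed undiluted.
activity = units_per_ml(20_200, 200, 100_000)
print(f"{activity:.1f} U/ml")
```

With these hypothetical inputs the estimate falls within the 20 to 80
units/ml range covered by the control enzyme dilutions.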

To estimate protein concentration, an aliquot of a native T.
flavus DNA polymerase preparation (1100 U/ml) was separated on a 5 - 25%
SDS-polyacrylamide gel, using the Bio-Rad protocols (Hercules, CA) and the
Bio-Rad Mini-Protean II electrophoresis unit. The concentration was
estimated at 33 µg/ml when compared to co-electrophoresed protein standards.
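
The figures given for the native enzyme preparation can be combined into
yield and specific activity estimates. The sketch below is illustrative
arithmetic only; the variable names are hypothetical, and the two specific
activity values it derives are simple quotients of numbers stated in the
text, not additional measurements.

```python
# Figures stated for a typical native preparation.
total_units = 2_000_000      # units recovered
cell_mass_g = 1200           # grams of cell paste
protein_mg = 40              # approximate mg of DNA polymerase

# Figures stated for the aliquot examined by SDS-PAGE.
activity_u_per_ml = 1100     # U/ml
protein_ug_per_ml = 33       # ug/ml, estimated against protein standards

units_per_g = total_units / cell_mass_g
specific_from_yield = total_units / protein_mg                        # U/mg
specific_from_gel = activity_u_per_ml / protein_ug_per_ml * 1000.0    # U/mg

print(f"{units_per_g:.0f} U/g cells")
print(f"{specific_from_yield:.0f} U/mg from the overall yield")
print(f"{specific_from_gel:.0f} U/mg from the gel estimate")
```

The first quotient, about 1,667 U/g, rounds to the 1,700 units/g quoted
above; the two specific activity estimates come to 50,000 U/mg and about
33,000 U/mg respectively.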

To obtain amino acid sequence information from native T.
flavus DNA polymerase, about 53 µg of native polymerase were separated on
a preparative 7.5% SDS-polyacrylamide gel, blotted onto PVDF membrane
and stained with amido black as described by Matsudaira, J. Biol. Chem. 262:
10035-10038 (1987). The major band at approximately 83 kD was excised and
sequenced using an Applied Biosystems (Foster City, CA) 477A Protein
Sequencer. No N-terminal sequence was obtained under these conditions.

Due to the apparent block at the N-terminus of the native T.
flavus DNA polymerase I (holoenzyme), another approach was employed to
obtain a partial amino acid sequence. Native T. flavus DNA polymerase I was
digested with trypsin, and the resulting peptides were separated using reverse
phase high-performance liquid chromatography (HPLC). The N-terminal
amino acid sequences of four of these peptides (peptides 1-4) were
determined. The amino acid sequence of one of the peptides, peptide 1, is
LHTRFNQTATATGRLSSSDPNLQNIPVR. This sequence has been
determined to map at positions 562 to 589 in the deduced amino acid sequence
of the Tfl DNA pol I holoenzyme described herein (FIGURE 2). As
explained in Example 3, knowledge of this amino acid sequence information
was used to isolate the T. flavus DNA polymerase I gene.

EXAMPLE 2 

Construction and Amplification 
of a Thermus flavus genomic DNA library 

A Thermus flavus genomic library was constructed in phage λ
DASH II and amplified in the following manner.

Genomic DNA from Thermus flavus, cultured overnight as
described above, was isolated according to the procedure described by
Ausubel et al., Current Protocols in Molecular Biology, Greene Publishing
Associates and John Wiley & Sons, New York (1990). In general, yields of
genomic DNA between 100 and 900 µg were obtained from the cell pellet of
about 1.5 ml of culture.

Twenty-five micrograms of Thermus flavus genomic DNA were
partially digested with 0.3 units of Sau3AI in a total reaction volume of 50
µl. At 0, 5, 10, 15, and 30 min., 10 µl samples were removed and the
enzyme was inactivated at 65°C for 15 min. An aliquot from each time point
was analyzed on a 1.2% agarose/TBE gel. The 10 min. reaction time
produced fragments having the desired size distribution (3 kb to 20 kb).

Approximately 2.5 pmoles of 5'-ends of Sau3AI-digested T.
flavus DNA were treated with calf intestinal alkaline phosphatase (CIP) using
standard techniques (Ausubel et al., Current Protocols in Molecular Biology
(1990)). One half of the CIP-digested sample was electrophoresed on a 0.7%
agarose gel and checked for amount and integrity. The Sau3AI-digested,
CIP-treated T. flavus DNA was extracted with phenol/chloroform and
chloroform, ethanol precipitated, pelleted, and washed in 70% ethanol. The
pellet was stored at -20°C. This DNA is referred to as "CIP TFL DNA."

The T. flavus library was constructed as described in the
manufacturer's instructions using the phage λ DASH II/BamHI Cloning Kit
(Stratagene, La Jolla, CA) and the CIP Tfl DNA. The pME/BamHI test insert
(0.3 µg) was run in parallel as a control. The ligation mixture was incubated
overnight at 4°C.

The T. flavus DNA ligated to λ DASH II arms was packaged
in vitro using the Gigapack II Gold Packaging Extract from Stratagene,
according to the manufacturer's instructions. Control DNA provided by the
manufacturer was also packaged.

Following the protocol provided by Stratagene with the λ
DASH II/BamHI Vector Kit, host bacteria were prepared: Escherichia coli
VCS 257 (Stratagene) for wild type phage; E. coli SRB and SRB(P2)
(Stratagene) for the T. flavus library and the control. VCS 257 was grown in
NZY + maltose medium; SRB and SRB(P2) were grown in NZY + maltose
medium with 50 µg/ml kanamycin at 37°C for 6 hours. After centrifugation
of the cells at 2800 x g for 10 min., the cells were resuspended in sterile 10
mM MgSO4 to give an A600 (optical density at 600 nm) of 0.5.

Two 1:10 serial dilutions were prepared from the control phage
and the CIP Tfl DNA library. Ten microliters of undiluted, 1:10, and 1:100
dilutions of phage were added to 200 µl of SRB cells. The cells were
incubated with light shaking at 37°C for 15 minutes and after the addition of
top agar, the mixture was poured onto LB/M/M plates. The plates were
incubated overnight at 37°C.
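
The plating scheme above leads to a titer by the usual relation: plaques counted divided by the product of the dilution factor and the volume of phage plated. The sketch below illustrates that arithmetic only; the function name and the plaque count of 96 are hypothetical and are not data from this example.

```python
def titer_pfu_per_ml(plaques: int, dilution: float, volume_plated_ml: float) -> float:
    """Titer (pfu/ml) = plaques / (dilution factor x volume of phage plated in ml)."""
    return plaques / (dilution * volume_plated_ml)


# Hypothetical count: 96 plaques from 10 microliters (0.010 ml) of the 1:100 dilution.
example_titer = titer_pfu_per_ml(plaques=96, dilution=1e-2, volume_plated_ml=0.010)
print(f"{example_titer:.2e} pfu/ml")
```

The 200 µl of host cells does not enter the calculation, since the titer refers to the phage stock rather than to the plating mixture.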

The T. flavus library was amplified using techniques described
by Ausubel et al., Current Protocols in Molecular Biology (1990). The
primary and amplified libraries were titered on SRB cells, and the titers are
shown in Table 1B. The amplified library was stored at 4°C.



TABLE 1B

Titer (plaque forming units/ml)

Construct         Primary Library    Amplified Library

CIP TFL DNA       4.4 x 10 s         9.6 x 10^7
pME/BamHI         3 x 10*            1.5 x 10^9
λ Control DNA     1.1 x 10^9         Not determined


EXAMPLE 3 

Cloning and Sequencing a Tfl DNA Pol I Gene Fragment 

The amino acid sequence information derived from four Tfl
DNA pol I peptides (Example 1) was used to design two
primers for the amplification of a T. flavus DNA polymerase gene fragment:
primer FTFL2 (primer "2"; 21mer) (SEQ ID NO: 8) and primer RTFL4
(primer "4"; 25mer) (SEQ ID NO: 9) (synthesized by Synthetic Genetics, San
Diego, CA). The sequence of the two primers was also compared to the T.
aquaticus DNA polymerase sequence (Lawyer et al., J. Biol. Chem. 264:
6427-6437 (1989)); the primer nucleotide sequences, with cross-references to
the Sequence Listing and Sequence ID Nos., are shown in Table 2. Primer
FTFL2 was chosen because the amino acid sequence obtained from peptide 1
(Example 1) was identical to a sequence in the Taq DNA polymerase I
protein. Primer FTFL2 corresponds to nucleotides 1719-1740 of the T.
aquaticus DNA polymerase coding sequence, top strand (i.e., to a portion of
the sequence that encodes a portion of the Taq DNA pol I protein that is
homologous to Peptide 1). Primer RTFL4 hybridizes to the 3'-end of the Taq
DNA pol I gene at position 2476 - 2500 and has sequence identical to the
bottom strand (Lawyer et al., J. Biol. Chem. 264: 6427-6437 (1989)).
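
The relationship between primer FTFL2 and peptide 1 can be illustrated by translating the primer sequence listed in Table 2 and locating the result within the peptide. This is a minimal sketch that assumes the primer's reading frame starts at its first base and uses the standard genetic code; the helper names are hypothetical.

```python
# Standard genetic code, with codons ordered T, C, A, G at each position.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    b1 + b2 + b3: AMINO_ACIDS[16 * i + 4 * j + k]
    for i, b1 in enumerate(BASES)
    for j, b2 in enumerate(BASES)
    for k, b3 in enumerate(BASES)
}


def translate(dna: str) -> str:
    """Translate a DNA string in frame 1, ignoring spaces and any trailing partial codon."""
    seq = dna.replace(" ", "").upper()
    usable = len(seq) - len(seq) % 3
    return "".join(CODON_TABLE[seq[i:i + 3]] for i in range(0, usable, 3))


FTFL2 = "CTA AGT AGC TCC GAT CCC AAC"               # SEQ ID NO: 8, from Table 2
PEPTIDE_1 = "LHTRFNQTATATGRLSSSDPNLQNIPVR"         # from Example 1

encoded = translate(FTFL2)
offset = PEPTIDE_1.find(encoded)                   # 0-based index, -1 if absent
print(encoded, offset)
```

With these inputs the seven encoded residues fall inside peptide 1, consistent with the statement that FTFL2 was chosen from the region homologous to that peptide.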

A typical amplification reaction (100 µl) contained 0.2 mM
deoxynucleotide triphosphates (dNTPs), 1 x Taq Polymerase Reaction Buffer
(10 x buffer is 100 mM Tris-HCl, pH 8.4, 500 mM KCl, 15 mM MgCl2), 0.5
µM of each primer FTFL2 and RTFL4 (primer set 2-4), 50 µl mineral oil and
15 ng T. flavus genomic DNA. After the initial denaturation step (Step 1),
2.5 units of AmpliTaq DNA polymerase (Perkin Elmer No. N801-0060,
Foster City, CA) were added. Negative control reactions containing either
no enzyme or no template were performed. The amplification program was
carried out in a thermocycler as follows: Step 1: 95°C for 5 min.; Step 2:
hold at 72°C (for the time required to add the enzyme); Step 3: 55°C for 45
sec.; Step 4: 72°C for 5 min.; Step 5: 95°C for 15 sec.; Step 6: repeat Steps
3-5 34 times; Step 7: 55°C for 45 sec.; Step 8: 72°C for 20 min.; Step 9:
hold at 4°C until processing the product. Under these conditions primer set
2-4 gave a single amplification product from T. flavus genomic DNA. The
observed mobility of the amplification product ("the 2-4 fragment") in 1% and
1.2% agarose gels was in agreement with the 782 bp predicted from the T.
aquaticus coding sequence.
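
The predicted product size and the primer amounts in this reaction follow from simple arithmetic, sketched below. The function names are hypothetical. The coordinates are the Taq positions quoted in this example (forward primer starting at 1719, reverse primer ending at 2500), treated as inclusive.

```python
def amplicon_length(forward_start: int, reverse_end: int) -> int:
    """Length of a PCR product spanning two inclusive template coordinates."""
    return reverse_end - forward_start + 1


def pmol_in_reaction(conc_uM: float, volume_ul: float) -> float:
    """Amount in pmol: micromolar concentration times volume in microliters."""
    return conc_uM * volume_ul


predicted_bp = amplicon_length(1719, 2500)
primer_pmol = pmol_in_reaction(0.5, 100)
print(predicted_bp, primer_pmol)
```

The first value reproduces the 782 bp prediction stated above. The second indicates that each primer is present at 50 pmol per 100 µl reaction.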

The 2-4 fragment was cloned, sequenced, and compared to a
previously published DNA sequence for a purported T. flavus DNA polymerase
I as follows.

First, to improve cloning efficiency, the 2-4 fragment was
fractionated, blunt ended, and phosphorylated as follows. Approximately 20
µl of the 2-4 fragment was loaded onto a Sephacryl S-500 (400 µl in a spin
filter, preswollen, pre-equilibrated and stored in 100 mM Tris-HCl, pH 8.0,
1 mM EDTA) column and centrifuged at 2,000 x g for 5 min. to trap the
unused primers from the PCR reaction. DNA that passed through the column
was ethanol-precipitated and resuspended in double-distilled water (ddH2O).
The 2-4 fragment was blunt-ended using mung bean nuclease (MBN)
(Molecular Biology Resources, Inc., Cat. No. 1190-01), and phosphorylated
with T4 polynucleotide kinase (Molecular Biology Resources, Inc., Cat. No.
1260-01) prior to ligation to a vector by procedures well known in the art.

M13mp18 RF DNA (Life Technologies, Grand Island, NY) was
restriction-digested with Hinc II and Ecl136 II to create blunt ends for
ligation to the above 2-4 fragment. The vector ends were dephosphorylated
with CIP to reduce the probability of self-ligation. The digested and
dephosphorylated M13mp18 vector and the 2-4 fragment
(fractionated/MBN/kinased) were ligated for 2 hr. at room temperature using
procedures that are well known in the art.

Using standard techniques, 5 µl of the ligation reaction were
added to 50 µl of DH5αF' E. coli (Life Technologies, Grand Island, NY)
cells, which had first been made competent, and the cells were transformed
(see Ausubel et al., Current Protocols in Molecular Biology (1990)).
Different numbers of cells were spread on 2XTY plates and grown until
plaques appeared. Several plaques were picked, and DNA was prepared using
the Minute Miniprep ssDNA Purification Kit (CHIMERx, Madison, WI). It
was determined that plaques designated M13-TFL 4.21 and 4.22 contained the
2-4 fragment in opposite orientations.

The DNAs M13-TFL 4.21 and 4.22 were sequenced by 
Sanger's dideoxy method (Sanger et al., Proc. Natl. Acad. Sci. 74:5463 
(1977)) using the SEQUAL™ Sequencing Kit from CHIMERx. The forward
sequencing primers (FSP, Table 2, SEQ ID NO: 6) used in sequencing
M13-TFL 4.21 and M13-TFL 4.22 were end-labeled using [γ-32P]-ATP and T4
polynucleotide kinase. The extension/termination reactions were performed
according to the protocol provided with the SEQUAL™ Sequencing Kit
(CHIMERx). One microliter of each extension/termination reaction was
loaded onto a 6% sequencing gel, which was electrophoresed at 3000 volts for 
3 hours. The bands were detected by autoradiography and the sequence was 
determined. 

When the nucleotide sequence from both ends of the 2-4 DNA 
was aligned, 771 bp of the approximately 780 bp DNA could be determined.
This 2-4 DNA sequence was compared with the purported T. flavus DNA
polymerase sequence reported by Akhmetzjanov and Vakhitov, Nucleic Acids
Res. 20:5839 (1992), which theoretically should have been amplified by this
primer set. Eighty-four percent maximum matching, as calculated by the
MacDNAsis software program (Hitachi Software Engineering America, San
Bruno, CA), was found. This degree of homology, compared with a reported
DNA polymerase gene, suggested that the 2-4 DNA was indeed part of the T.
flavus (ATCC 33923) gene and could serve as a useful probe for screening the
T. flavus genomic library. The homology of 84% also suggested that either
(1) the purported T. flavus strain studied by Akhmetzjanov and Vakhitov and
the T. flavus strain (ATCC 33923) do not have identical DNA polymerase I
genes; or (2) the two strains have more than one gene or gene-like sequences
having homology to DNA polymerase I genes.
having homology to DNA polymerase I genes. 
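
A percent-matching figure of the kind quoted above can be illustrated with a simple identity count over two pre-aligned sequences. The sketch below is not the MacDNAsis algorithm, whose scoring is not described here. It assumes the alignment is already given, counts gap columns as mismatches, and uses two short hypothetical sequences rather than the actual 2-4 DNA.

```python
def percent_identity(aligned_a: str, aligned_b: str) -> float:
    """Percent of alignment columns with identical, non-gap characters."""
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    matches = sum(
        1 for a, b in zip(aligned_a, aligned_b) if a == b and a != "-"
    )
    return 100.0 * matches / len(aligned_a)


# Hypothetical aligned fragments; '-' marks a gap.
example_identity = percent_identity("ACGTACGT-A", "ACGAACGTTA")
print(example_identity)
```

Different programs treat gaps and alignment length differently, so a value computed this way would not necessarily reproduce the 84% reported by MacDNAsis.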

EXAMPLE 4

Preparation of Gene-Specific Probes and Screening
of the Thermus flavus Genomic Library for Clones
Containing the T. flavus DNA Pol I Gene

The 2-4 fragment described in Example 3 was used to isolate
the Thermus flavus DNA pol I gene from the T. flavus genomic library
(Example 2). Using M13-TFL 4.21 as template and primers FTFL2 and
RTFL4, the 2-4 fragment was amplified by PCR as described above to obtain
larger quantities of the fragment for use in preparing probes to screen the T.
flavus genomic library. The amplified 2-4 fragment, migrating at about 780
bp, was cut out of a preparative 0.7% agarose gel, eluted, phenol-chloroform
extracted and ethanol-precipitated. Approximately 1 µg of the 2-4 fragment
was digested with CviJI* (CHIMERx) to generate sequence specific primers
for labeling. A variety of thermal cycle labeling (TCL) probes were prepared
with the 2-4 intact fragment (i.e., Biotin-11-dUTP, fluorescein and
[α-33P]dCTP probes) in the manner described below. Each set of duplicate
plaque lifts or targets was screened using two different types of labeled
probes. Digestion with CviJI*, as well as this method of labeling, is
described in a co-owned, copending U.S. Patent Application Ser. No.
08/217,459, filed March 24, 1994, entitled "Methods and Materials for
Restriction Endonuclease Applications," incorporated herein by reference in
its entirety. The PCT counterpart of this application, filed March 24, 1994,
is PCT App. No. US94/03246.

The 2-4 intact fragment was labeled with Biotin-11-dUTP as
described in the manual for the ZEPTO™ Labeling Kit (CHIMERx). To
determine the relative efficiency of the amplification reaction, 5 µl of the
amplified 2-4 TCL probe was electrophoresed on a 0.7% agarose gel
alongside a 1 kb molecular size ladder. The amplified probe was evident as a
smear from 0.1-5 kb, which is an indication of a successful TCL reaction.

To determine the efficiency of incorporation of the
biotin-11-dUTP, a dot blot assay was performed as follows: A serial dilution
of the probe from 1:10 to 1:10 s was made in TE (10 mM Tris-HCl, 1 mM
EDTA, pH 8.0) and 1 µl of each dilution was spotted on a Hybond-N
membrane (Amersham, Arlington Heights, IL), UV-cross-linked for 3 min.,
followed by colorimetric detection of the incorporated biotin-11-dUTP using
streptavidin-alkaline phosphatase as described in the ZEPTO™ labeling
manual. The probe was detected at 1:10^6 dilution, suggesting that the
biotin-labeled 2-4 fragment was efficiently labeled and is highly sensitive for
the screening of the Tfl genomic library.

The fluorescein labeled 2-4 fragment was prepared and analyzed
as above except fluorescein-12-dUTP was used instead of biotin-11-dUTP.
The fluorescein-12-dUTP incorporation was detected using alkaline
phosphatase conjugated anti-fluorescein antibody (Boehringer-Mannheim,
Indianapolis, IN) instead of streptavidin-alkaline phosphatase. These probes
were detected at a 1:10* dilution by the colorimetric assay as described in the
ZEPTO™ labeling manual or by chemiluminescence. Both the biotin and
fluorescein non-radioactive probes were aliquoted and used throughout the
entire screening process.

The preferred detection method for both the biotin-11-dUTP
probes and the fluorescein-12-dUTP probes was chemiluminescence. For this
method of detection the filters with hybridized probes were incubated either
with streptavidin-alkaline phosphatase or alkaline phosphatase conjugated to
anti-fluorescein antibody for 30 min. at room temperature. They were then
rinsed three times with wash buffer (1 x phosphate buffered saline (PBS),
0.3% Tween 20 (Sigma Chemical Co., St. Louis, MO), 0.02% Na-azide) for
15 min. each and finally in assay buffer (0.1 M diethanolamine, 1 mM MgCl2
and 0.02% Na-azide, pH 10) for 5 minutes. They were finally incubated in
assay buffer containing CSPD™ (Tropix, Bedford, MA), a chemiluminescence
substrate, for 15 min. in the dark followed by exposure to X-ray films. The
normal exposure times for the biotin-11-dUTP probes were 5-30 min. and for
the fluorescein-12-dUTP probes were 2-6 hours.

The 2-4 intact fragment was labeled with [α-33P]dCTP as
described in the ZEPTO™ labeling manual; a total of 6 x 10^7 cpm of
[α-33P]dCTP at 1 x 10^9 cpm/µg was incorporated. For probes, 1-5 x 10^6 cpm
of radio-labeled DNA was added to each plaque lift.
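
The figures above imply a probe mass that can be worked out directly: total incorporated counts divided by the specific activity. The sketch below shows that arithmetic. The function name is hypothetical, and the calculation assumes the stated specific activity applies uniformly to all of the labeled DNA.

```python
def probe_mass_ng(cpm: float, specific_activity_cpm_per_ug: float) -> float:
    """Mass of labeled DNA in ng for a given number of counts."""
    return cpm / specific_activity_cpm_per_ug * 1000.0


SPECIFIC_ACTIVITY = 1e9                                   # cpm per microgram, from the text
total_ng = probe_mass_ng(6e7, SPECIFIC_ACTIVITY)          # whole labeling reaction
per_lift_ng = (probe_mass_ng(1e6, SPECIFIC_ACTIVITY),     # low end per plaque lift
               probe_mass_ng(5e6, SPECIFIC_ACTIVITY))     # high end per plaque lift
print(total_ng, per_lift_ng)
```

On these assumptions the reaction yields roughly 60 ng of labeled DNA, and each plaque lift receives on the order of 1 to 5 ng.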

The sensitivity and specificity of the labeled probes were
demonstrated by screening blots of digested T. flavus genomic DNA.
Specifically, T. flavus genomic DNA was restricted with different restriction
enzymes, such as BamHI, BglI, DraI, EcoRI, EcoRV and PacI. 250 ng/lane
of restricted DNA, along with 500 ng of IL-3A viral DNA as negative control
(Xia, Y., et al., Nucleic Acids Research 15: 6075-6090 (1987)), were
electrophoresed on a 0.7% agarose gel. A Southern transfer of this gel onto
Hybond-N was prepared. The denatured DNA on the Southern blots was
UV-cross-linked to the filter for 3 minutes. Duplicate blots were prehybridized
in 2 ml of hybridization buffer (50% deionized formamide, 7% SDS, 120 mM
Na phosphate, pH 7.2, 250 mM NaCl, 1 mM EDTA and 1 mM
cetyldimethylethylammonium bromide and 20 µl of denatured salmon sperm
DNA at 10 mg/ml) in a heat-sealed plastic bag at 52°C for 1 hour. Seven
µl of either the biotin-11-dUTP 2-4 TCL probe or the fluorescein-12-dUTP
2-4 probe was added to one set of the Southern blots and 1-5 x 10^6 cpm of the
[α-33P]dCTP 2-4 TCL probe was added to the duplicate blot. The filters were
hybridized by incubation overnight at 52°C.

The filters with the radioactive probe were incubated with low
stringency buffer (1 x SSC, 1% SDS) for 1 hr. at 52°C, washed with high
stringency buffer (0.1 x SSC, 1% SDS) for 1 hr. at 50°C, dried, and then
exposed to X-ray film for 3 hours.

The detection of non-radioactive probes was accomplished as
described above. Both the biotin-11-dUTP and the [α-33P]dCTP 2-4 TCL
probes recognized a large molecular weight band at about 20 kb in all the
lanes containing digested T. flavus genomic DNA, although the mobility of the
bands varied somewhat in the lanes containing different digests. The probes
did not bind to the control IL-3A DNA, suggesting that the probes were
specific for the target and could be used to screen the T. flavus genomic
library.

To screen the amplified T. flavus genomic library (Example 2),
the phage library was plated on two plates each at 10^5 plaque-forming units
(pfu)/100 mm 2XTY plate. Duplicate plaque lifts on Hybond N from each
plate were obtained and prepared for hybridization by methods well known in
the art (Sambrook, Fritsch, and Maniatis, Molecular Cloning, A Laboratory
Manual, 2nd ed., Cold Spring Harbor Press (1989)). The DNA on the plaque
lifts was UV-cross-linked to the Hybond N for 3 minutes and each plaque lift
was placed in a heat-sealed plastic bag and prehybridized as described above.
Seven µl of either the 2-4 biotin-11-dUTP TCL probe or the 2-4
fluorescein-12-dUTP probe were added to one set of the plaque lifts and
1-5 x 10^6 cpm of the [α-33P]dCTP 2-4 TCL probe were added to the duplicate
filters. The filters were incubated overnight at 52°C and washed the next day
with low and high stringency buffers as described above. The filters with
non-radioactive probes were incubated with 10 ml of conjugation buffer (0.5%
casein, 1 x PBS and 0.02% Na-azide) for 30 min. at room temperature.
Hybridization conditions, washes and detection were as described above.

Approximately 25 positive plaques (hybridizing with the labeled
probes) out of 10^5 pfu from the amplified CIP Tfl DNA library were detected
on the duplicate plaque lifts.

Ten positive plaques were selected (λ21, λ31, λ51, λ61, λ71,
λ81, λ91, λ101, λ111 and λ121) and were purified by two rounds of dilution
and screening with the labeled 2-4 probes, until well-isolated, single, positive
plaques were obtained.

The four stocks of phage λ21, λ51, λ71, and λ91 were grown
at 5 x 10 s pfu/2XTY plate, 5 plates per stock. The phages were eluted from
the plates by a standard protocol (Sambrook et al. (1989)). The eluant was
treated with 20 µg/ml DNase and 50 µg/ml RNase A for 1 hr. at 37°C and
extracted with both phenol-chloroform and chloroform. The DNA was
ethanol-precipitated, pelleted, rinsed with ethanol, resuspended in 1 ml of TE
buffer (10 mM Tris pH 8.0, 1 mM EDTA) and purified using the Lambda
prep kit from CHIMERx.

The phage DNA was restriction-digested with EcoRI and
BamHI and electrophoresed on a 0.7% agarose gel, transferred to Hybond N
and probed with the 2-4 TCL probes. Based on agarose gel band distribution
and Southern blot detection by the 2-4 probes, the four phages were grouped
into two classes. Clones λ21 and λ91 belong to one class and clones λ51 and
λ71 belong to a second class. Clones λ21 and λ51 were chosen for further
analyses.

Clone λ21 was digested with BamHI and the T. flavus insert
was subcloned into pTZ18U (Mead et al., Protein Engineering 1: 67-74
(1986)). Eight of these clones were sequenced using the SEQUAL™
Sequencing kit from CHIMERx. One of these clones, designated p21BG,
hybridized to the 2-4 TCL probes and yielded sequences identical to the
sequence of the 2-4 fragment between the BamHI and Eco47III sites (these
sites begin at positions 2084 and 2387 in Fig. 4, respectively). This sequence
information confirmed that the clones contained authentic T. flavus DNA
polymerase gene sequence, and confirmed the orientation of this gene
sequence in the clones.

Based on agarose gel analysis neither λ21 nor λ51 had any
internal EcoRI sites, hence λ21 and λ51 were restriction-digested with
EcoRI and the insert was cloned into pTZ18U for ease in further analyses. The
resulting recombinants were designated p21E10 and p51E9, respectively
(FIGURE 1A). Each clone had an insert of about 14-16 kb.

EXAMPLE 5

Sequencing the T. flavus DNA Polymerase I Gene

Three main strategies were adopted for sequencing the Tfl DNA
pol I gene.

In a first strategy, several primers were designed based on the
purported T. flavus DNA pol I sequence published by Akhmetzjanov and
Vakhitov (1992) and were synthesized by Synthetic Genetics, San Diego, CA
(e.g., primers FTFL10, FTFL11, RTFL12, RTFL13, FTFL15, RTFL16,
FTFL17 and RTFL18 (Table 2)). Dideoxy-sequencing of the λ21 and λ51
clones was attempted using these primers, but only primers FTFL17 and
RTFL18 yielded good sequence data and only very faint bands were obtained
with primers FTFL11 and RTFL13, suggesting only partial homology to this
purported DNA pol I sequence.

In a second sequencing strategy, deletion vectors of p21E10
were obtained by restriction digestion of the plasmid with HincII, HindIII,
SphI, KpnI and BamHI. These restriction enzymes cut once in the multiple
cloning site and once or twice in the insert. The digests were diluted to allow
self-ligation and transformed into E. coli strain DH5αF' by standard methods.
The clones that ligated back to the vector were selected on
ampicillin-containing plates and picked for further sequence analysis.

All of these clones had deletions of different lengths at the 3'
end. The size of the insert in the HincII deletion vector (p21EHc) was
approximately 4.6 - 4.7 kb; in the KpnI deletion vector (p21EK) about 7 kb;
in the HindIII deletion vector (p21EHd) about 1.4 kb; in the SphI deletion
vector (p21ES) about 1.6 kb; and in the BamHI deletion vector (p21EB) about
1.2 kb. The plasmids p21EHd, p21EB and p21ES were sequenced with
[γ-32P] end-labeled FSP by dideoxy sequencing. The sequences obtained were
within the region of the 2-4 fragment, further confirming the orientation of the
insert and the presence of the 3' region of the T. flavus DNA polymerase I in
the p21E10 parent clone.

Clone p21EHc is a deletion derivative containing the entire
portion of the Tfl DNA polymerase I gene DNA present in p21E10 and about
3 kb DNA downstream from the stop codon of the Tfl DNA pol I gene, but
lacking about 9.0 kb of unwanted 3' end sequence (FIGURE 1A). In
addition, the DNA sequence obtained from p21E10 and p21EHc using [γ-32P]
end-labeled reverse sequencing primer (RSP, Table 2) suggested that p21E10
contained only about 2/3 of the DNA polymerase gene and lacked the 5'
one-third of the gene. In contrast, sequence obtained from p51E9 suggested that
this clone contains a 5' portion of the Tfl DNA pol I gene that overlaps the
coding sequence contained in p21E10, as well as significant Tfl DNA
upstream of the gene.

A primer walking sequencing strategy was employed to obtain
the remainder of the sequence of the Tfl DNA pol I gene. This strategy is
described as "Directed Sequencing with Progressive Oligonucleotides" in
Sambrook et al., Molecular Cloning, A Laboratory Manual, Cold Spring
Harbor Press (1989). To obtain additional sequence information from the
clones by primer walking, primers homologous or complementary to the ends
of previously determined sequences obtained from p21E10, from the deletion
vectors, and from primers FTFL17 and RTFL18 were synthesized as
described above and used in additional sequencing reactions. By repeating
this process the entire length of the gene was sequentially sequenced.

Specifically, TFL primers FTFL17A, RTFL18A, TFLEF1 and
TFLER1 (Table 2 and FIGURE 4) were synthesized by Synthetic Genetics for
primer walking based on the sequence information obtained from primers
FTFL17, RTFL18 and RSP on the p21E10 template DNA (Table 2 and
FIGURE 4). Primer TFLSF1 was designed as a forward primer for walking
into the 3' end of the DNA pol I gene, by utilizing sequence information from
p21ES. As more information became available, additional TFL primers
RTFLA - FTFLZ (Table 2; FIGURE 4) were designed for sequencing both
strands.

The DNA sequence of the T. flavus DNA polymerase I gene
and flanking sequences are given in FIGURE 2 (SEQ ID NO: 1), along with
the deduced amino acid sequence (SEQ ID NO: 2). The sequence of 3326
bp has been determined. FIGURE 2 depicts 3048 bases of this sequence,
of which 2505 bases are deduced to encode a polypeptide of 834 amino acids
(plus stop codon). The coding sequence was determined to be 86%
homologous to the Taq polymerase gene and 83% homologous to the
purported Tfl polymerase gene published by Akhmetzjanov and Vakhitov.
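
The numbers in this paragraph are internally consistent, as the short check below illustrates: 834 codons plus one stop codon account for the stated coding length, and the remainder of the depicted sequence is flanking DNA. The variable names are hypothetical and the block is only a worked arithmetic example.

```python
AMINO_ACIDS = 834        # length of the deduced polypeptide
DEPICTED_BASES = 3048    # bases shown in FIGURE 2
CODING_BASES = 2505      # bases stated to encode the polypeptide plus stop codon

coding_from_protein = AMINO_ACIDS * 3 + 3          # three bases per codon, plus a stop codon
flanking_bases = DEPICTED_BASES - CODING_BASES     # depicted bases outside the coding region
print(coding_from_protein, flanking_bases)
```

The computed coding length equals the 2505 bases given in the text, leaving 543 depicted bases of flanking sequence.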

The deduced amino acid sequence of the T. flavus DNA 
polymerase I gene was aligned and compared to the deduced amino acid 
sequences of the purported Tfl DNA pol I published by Akhmetzjanov and 
Vakhitov (85% homology) and to the deduced amino acid sequence of the Taq
pol I gene (87% homology). As shown in FIGURE 3, the amino acid
alignment chosen to maximize homology reveals two single amino acid
insertions in the T. flavus DNA pol I reported here, relative to the other two
reported sequences. The single amino acid inserts are depicted by dashes (-)
in the sequences for Taq pol I and for Akhmetzjanov and Vakhitov's
purported Tfl DNA pol I.

TABLE 2

Primer name Primer Sequence                              Seq. ID

FSP         CGC CAG GGT TTT CCC AGT CAC GAC              6
RSP         AGC GGA TAA CAA TTT CAC ACA GGA              7
FTFL2       CTA AGT AGC TCC GAT CCC AAC                  8
RTFL4       ATC ACT CCT TGG CGG AGA GCC AGT C            9
FTFL10      ATT TAG CAC ATA TGG CGA TGC TTC CC           10
FTFL11      CTT TCC AGC TCC GAC CCC AAC                  11
RTFL12      CCT ACT CCT TGG CGG AGA GCC AGT C            12
RTFL13      TGG ATG TCC CTC CCC TCC TGA AAG A            13
FTFL15      CCC TTT CCC GGA AGC TTT CCC AGG TGC A        14
RTFL16      TGC ACC TGG GAA AGC TTC CGG GAA AGG G        15
FTFL17      CCT GCA GTA CCG GGA GCT CAC CAA GCT CAA      16
RTFL18      TTG AGC TTG GTG AGC TCC CGG TAC TGC AGG      17
FTFL17A     TGG ACT ATA GCC AGA TAG AGC T                18
RTFL18A     AAG CGA AGA CCT CCT CCT CGA                  19
TFLEF1      AGT TCG GCA GCC TCC TCC ACG A                20
TFLER1      TCC AAG GAA AGC CTG AGG TCT T                21
TFLSF1      AAG CTC GCC ATG GTG AAG CTC TT               22
RTFLA       TCG GAG ACG AGT TGG TAG AGG T                23
FTFLB       ACC TCT ACC AAC TCG TCT CCG A                24
RTFLC       AGA GGA CGA AGC CCA CGA A                    25
RTFLD       AGG AGG TAG GCG AGG AGC AT                   26
FTFLE       ATG CTC CTC GCC TAC CTC CT                   27
FTFLF       TCG AGG AGG AGG TCT TCG CTT                  28
RTFLG       AGC TCT ATC TGG CTA TAG TCC A                29
FTFLH       ATA GGC TCT CCC AGG AGC TT                   30
RTFLI       AAG AGC TTC ACC ATG GCG AGC TT               31
RTFLJ       TTC CCC TGG AGG CGT TTC TGA                  32
RTFLK       AAA GAC CAC GAA GAC GGC CTT                  33
FTFLL       AAG GCC GTC TTC GTG GTC TTT                  34
FTFLM       AAG GAG TGG GGA AGC CTG GAA                  35
RTFLN       TTC CAG GCT TCC CCA CTC CTT                  36
RTFLO       TTC TTC CGA AGA GGG TTT CCA                  37
RTFLP       GCG TCC AGG AGC GCC CTG AGG A                38
FTFLQ       CCT CAG GGC GCT CCT GGA CGC CA               39
FTFLR       TTC GTC CTC TCC CGC CCC GA                   40
FTFLS       CCA ACC TGC AGA ACA TCC CCG T                41
RTFLT       GGT GTG GAT GTC CTT CCC CT                   42
FTFLU       CCC TGC CGT TTA GAG GAA GTT CAA G            43
RTFLV       CTT GAA CTT CCT CTA AAC GGC AGG G            44
RTFLW       ACC CGG CCT TTG GGT TCA AAG A                45
FTFLX       TCT TTG AAC CCA AAG GCC GGG T                46
RTFLY       TTC CCG TGC TCC TTC CGC TC                   47
FTFLZ       CTC GCC TTC CTC GTG CCC TT                   48
5'lac PCR   GCT TCC GGC TCG TAT GTT GTG TG               49
TFL-SDM-1   GGA AAG CCT GAG GTC TTC CAT AGC TGT TTC CTG  50
            TGT GAA ATT GTT ATC CGC TCA CAA TTC CAC ACA
            ACA T
TFL-SDM-3   ACC CGG CCT TTG GGT TCA AAG AGC GGA ACG ATC  51
            GCC TCC ATA GCT GTT TCC TGT GTG AAA TTG TTA
            TCC GCT CAC AAT TCC
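
Several of the primers in Table 2 appear to be exact reverse complements of one another, which fits the use of paired primers for sequencing both strands. The sketch below checks three such pairs using the sequences as listed. The helper names are hypothetical, and the pairing shown is an observation from the listed sequences rather than a statement made in the text.

```python
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def clean(seq: str) -> str:
    """Remove the triplet spacing used in Table 2."""
    return seq.replace(" ", "").upper()


def reverse_complement(seq: str) -> str:
    """Reverse complement of a DNA sequence."""
    return clean(seq).translate(COMPLEMENT)[::-1]


PRIMERS = {
    "FTFL2": "CTA AGT AGC TCC GAT CCC AAC",
    "RTFL4": "ATC ACT CCT TGG CGG AGA GCC AGT C",
    "FTFL17": "CCT GCA GTA CCG GGA GCT CAC CAA GCT CAA",
    "RTFL18": "TTG AGC TTG GTG AGC TCC CGG TAC TGC AGG",
    "RTFLA": "TCG GAG ACG AGT TGG TAG AGG T",
    "FTFLB": "ACC TCT ACC AAC TCG TCT CCG A",
    "TFLSF1": "AAG CTC GCC ATG GTG AAG CTC TT",
    "RTFLI": "AAG AGC TTC ACC ATG GCG AGC TT",
}


def is_complementary_pair(name_a: str, name_b: str) -> bool:
    """True if primer b is the exact reverse complement of primer a."""
    return reverse_complement(PRIMERS[name_a]) == clean(PRIMERS[name_b])


pair_results = {
    pair: is_complementary_pair(*pair)
    for pair in [("FTFL17", "RTFL18"), ("RTFLA", "FTFLB"),
                 ("TFLSF1", "RTFLI"), ("FTFL2", "RTFL4")]
}
print(pair_results)
```

The first three pairs anneal to the same site on opposite strands. FTFL2 and RTFL4 do not, as expected for a primer set that flanks an amplified fragment.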

EXAMPLE 6

Construction and Expression of a 
Full-Length T. flavus DNA Pol I Clone and Purification 
of Full-length Recombinant T. flavus DNA Pol I 

An expression vector containing the full-length T. flavus DNA
polymerase I gene was constructed as described below, utilizing plasmid
p51E9, which contains the 5' portion of the gene, and plasmid p21E10, which
contains a 3' portion of the gene that overlaps the 5' portion contained in
p51E9. FIGURES 1A and 1B are provided to illustrate steps in the
construction of expression vectors of this invention, and are not intended to
be a scale representation of clone inserts, or to contain a complete restriction
map of clones depicted therein for enzymes shown.

Referring to FIGURE 1A, clone p51E9, which carries the 5'
portion of the Tfl DNA pol I gene, was digested with BamHI and a 3.7 kb
digestion product was subcloned into the BamHI cloning site of pTZ18U to
produce recombinant plasmid p51B4, which was characterized as containing
about 1.5 kb of DNA upstream of the DNA pol I start codon contiguous with
the 5' region of the Tfl DNA pol I gene extending to the BamHI site in the
2-4 fragment. Plasmid p51B4 was then digested with XbaI, and a 2.5 kb
digestion product was subcloned into the XbaI site of pTZ18U to create
plasmid p51X16, which contained only approximately 0.3 kb of DNA
upstream of the DNA pol I start codon.

Next, plasmid p21EHc (a subclone of p21E10 described above)
was digested with BamHI and SalI. The 3.7 kb fragment containing the 3'
region of the Tfl DNA pol I gene, beginning with the BamHI site in the 2-4
fragment, was isolated and subcloned into pTZ18U that had been digested
with BamHI and SalI to create clone p21BHc. Clone p21BHc was digested
with EcoRV and BamHI and the 1.3 kb fragment containing the 3' region of
Tfl DNA pol I was ligated into pTZ18U that had been digested with BamHI
and HincII, yielding p21BRV2.

Plasmid p51X16 was digested with BamHI and the 2.5 kb
BamHI insert was isolated. Plasmid p21BRV2 was linearized with BamHI
and ligated to the BamHI fragment of p51X16. The resulting clones were
designated pTFL 1.3 and pTFL 1.4. The integrity of the Tfl DNA pol I gene
in clone pTFL 1.4 was verified by DNA sequence analysis using the primer
RTFLG (Table 2 and FIGURE 4).

Competent E. coli DH5αF' were transformed with plasmid
pTFL 1.4 (the 1st generation expression clone), from which a Tfl DNA pol
I protein was isolated and purified as follows. E. coli DH5αF'[pTFL-1.4]
were grown in a 50 liter fermentor (10 pounds back pressure, 30 lpm
aeration, 200 rpm agitation, at 37°C) in TB medium (Sambrook et al.,
Molecular Cloning, A Laboratory Manual, 2nd ed. (1989): 12 g
Bactotryptone, 24 g yeast extract, 4 ml glycerol, 0.1 g MgSO4, 2.31 g
KH2PO4, 12.54 g K2HPO4 per liter), supplemented with 50 µg/ml ampicillin
with vigorous aeration at 37°C. At O.D.600 = 1.0, IPTG was added to a final
concentration of 0.5 mM and the cells were cultured for an additional 2 hrs.
The culture was cooled down to 20°C and 100 ml of 100 mM phenylmethyl
sulfonyl fluoride (PMSF) in isopropanol was added. After brief mixing, the
culture was spun down in a Sharples centrifuge and the pellet (or paste) was
stored frozen at -70°C. Fifty grams of cells were thawed in 250 ml of lysis
buffer A (20 mM Tris-HCl pH 7.4, 0.5 mM EDTA, 100 mM NaCl, 5%
glycerol, 5 mM β-mercaptoethanol, 0.5% Nonidet P40, 0.5% Tween 20, 50
µg/ml PMSF, 0.5 µg/ml pepstatin A, 0.5 µg/ml leupeptin). The cell
suspension was homogenized twice in a Manton-Gaulin press. Because PMSF
is unstable in aqueous solutions, new PMSF was added again to a final
concentration of 50 µg/ml after the first and second homogenizations.
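
The PMSF addition to the fermentor culture can be put in concentration terms with a simple dilution calculation. The sketch below makes two assumptions that are not stated in the text: that the working culture volume is the full 50 liters, and that the molecular weight of PMSF is 174.19 g/mol. The function name is hypothetical.

```python
PMSF_MW_G_PER_MOL = 174.19   # assumed molecular weight of PMSF


def final_concentration_mM(stock_mM: float, stock_volume_l: float,
                           culture_volume_l: float) -> float:
    """Concentration after adding a stock solution to a culture (C1*V1 = C2*V2)."""
    return stock_mM * stock_volume_l / (culture_volume_l + stock_volume_l)


# 100 ml of 100 mM PMSF added to an assumed 50 liter culture.
pmsf_mM = final_concentration_mM(stock_mM=100.0, stock_volume_l=0.100,
                                 culture_volume_l=50.0)
# mM multiplied by g/mol gives mg/l, which is numerically equal to micrograms/ml.
pmsf_ug_per_ml = pmsf_mM * PMSF_MW_G_PER_MOL
print(round(pmsf_mM, 3), round(pmsf_ug_per_ml, 1))
```

Under these assumptions the harvest-time PMSF concentration is about 0.2 mM, which is of the same order as the 50 µg/ml used in the lysis buffer.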

The suspension of broken cells was divided into 100 ml aliquots
and heated to 65°C for 1 hr. to denature the bulk of E. coli proteins,
including nucleases, proteases and E. coli polymerases. Cell debris and
denatured proteins were centrifuged at 6,800 x g for 30 min. and the NaCl
concentration of the supernatant was adjusted to 400 mM. The presence of
DNA polymerase activity in the supernatant was confirmed using the standard
activity assay described above. Then 10% polyethyleneimine (PEI, pH 7.5)
was slowly added to a final concentration of 0.2%. After 30 min. of stirring
at 4°C, the suspension was centrifuged (1 hr., 6,800 x g) and the resulting
supernatant was diluted with 6 volumes of buffer C (20 mM Tris-HCl, pH
7.4, 0.5 mM EDTA, 5% glycerol, 5 mM β-mercaptoethanol, 0.5% Nonidet
P40, 0.5% Tween 20, 50 µg/ml PMSF, 0.5 µg/ml pepstatin A, 0.5 µg/ml
leupeptin). Ammonium sulphate was added to 0.55 g/ml and the mixture was
stirred slowly overnight at 4°C.

After centrifugation for 2 hrs. at 6,800 x g the supernatant was
carefully removed and tested for DNA polymerase activity. The
polymerase-containing pellet was dissolved in a total of 80 ml of Buffer A
(10 mM KPO4 pH 7.0, 0.5 mM EDTA, 100 mM NaCl, 5% glycerol, 5 mM
β-mercaptoethanol, 0.5% Nonidet P40, 0.5% Tween 20, 50 μg/ml PMSF).
Insoluble material was removed by centrifugation at 6,800 x g for 20 min.
The supernatant obtained from this centrifugation (which contains the
polymerase activity) was loaded onto a 5 x 50 cm Sephadex G-25 column
equilibrated in buffer A to desalt the solution and to remove traces of PEI.
The flow rate used on this column was about 200 ml/hr. Fractions of 25 ml
were collected and assayed for DNA polymerase activity. The flow-through
fractions contained the activity. It was essential to remove all the PEI for
efficient adsorption to the next column.

The crude Tfl DNA polymerase described above was applied
to a Bio-Rex 70 column (5 x 10 cm) (Bio-Rad) equilibrated in Buffer A. The
column was washed with 1.5 l of buffer A and the DNA polymerase was
eluted with 4 l of a 0 - 1 M NaCl gradient in buffer A. Fractions of 25 ml
were collected and assayed for DNA polymerase activity. Fractions
containing DNA polymerase activity (as assayed below) were pooled,
concentrated in an Amicon concentrator with a YM30 membrane to about 40
ml and dialyzed against two changes (1 liter each) of antibody column high
salt buffer B (20 mM Tris-HCl, pH 7.5, 0.5 mM EDTA, 0.5 M NaCl, 0.05%
Brij-35) and applied to an immunoaffinity column (1.5 x 8 cm).

The immunoaffinity column was prepared using techniques
well-known in the art. First, a mouse is injected with purified DNA
polymerase I to provide an immune response; preferred DNA polymerases for
generating antibodies are thermophilic DNA polymerases, including those
isolated from Thermus flavus and Thermus aquaticus. A ten week old female
BALB/c mouse (Harlan Sprague Dawley, Madison, WI) was immunized by
intraperitoneal injection with Taq polymerase (purified from Thermus
aquaticus ATCC #25104). To prepare the Taq polymerase for injection, Taq
storage buffer was removed with a Centricon 30 protein concentrator (Amicon
Corp.), and the concentrated protein was diluted with phosphate-buffered
saline. For the initial immunization, 40 μg of Taq emulsified with complete
Freund's adjuvant was injected. Five booster injections of 40 μg Taq
polymerase mixed with equal volumes of the Ribi Adjuvant System (Ribi
Immunochem Research, Inc., Hamilton, MT) were administered over a 6
month period, with successive intervals between injections of approximately
5 weeks, 4 weeks, 4 weeks, 12 weeks, and 4 weeks.

Five days after the final booster injection, the mouse was
sacrificed and spleen cells were isolated and fused with myeloma cells
(myeloma P3X63-AG8.653 (ATCC CRL 1580)) to generate hybridomas, using
techniques well-known in the art. See E. Harlow and D. Lane, Antibodies, A
Laboratory Manual, Cold Spring Harbor Laboratory, Cold Spring Harbor
(1988). In particular, fusions were performed in 50% polyethylene glycol and
selected in HAT medium. All hybridomas were screened as described below.

The fifth fusion experiment yielded a useful hybridoma, as
selected in the following manner. The hybridomas were distributed into
96-well plates. Of 1,176 wells filled, approximately 913 showed growth. To
initially screen these wells, an ELISA assay was employed. First, polystyrene
ELISA plates were coated with Thermus flavus (ATCC #33923) DNA
polymerase (1 μg/ml Tfl DNA pol I (MBR lot 20229) in 100 mM Tris-HCl,
pH 8.5/0.05% NaN3). Five microliter samples of supernatant from each
culture were diluted into 95 μl of Tris-buffered saline, pH 8.5/0.05% Triton
X-100 (TBST) and then incubated in the coated ELISA plates for 2 hours at
room temperature. The plates were then washed with TBST. To detect
positive anti-Tfl DNA polymerase cross-reactivity, a commercially available,
peroxidase-conjugated goat anti-mouse IgG, γ chain specific (Jackson
ImmunoResearch, West Grove, PA), was diluted 5000-fold in TBST, added
to the ELISA plates, and incubated for 1 hour at room temperature. Positive
cross-reactivity was detected colorimetrically with 3-methyl-2-
benzothiazolinone hydrazone/3-dimethylaminobenzoic acid/hydrogen peroxide.

Supernatants from wells that tested even weakly positive by
ELISA were further screened by immunoprecipitation of both Tfl and Taq
DNA polymerases using techniques well known in the art. See Harlow and
Lane, supra. The immunoprecipitation assay employed relies on the presence
of protein A (which binds IgG) on the surface of Staphylococcus aureus
(SAC, Sigma Chemical Co., St. Louis, MO). Since protein A does not bind
strongly to a common subclass of mouse IgG, IgG1, but does bind rabbit IgG
strongly, a pellet of centrifuged SAC cells was first treated with rabbit
anti-mouse IgG antibodies. The pellet from 10 μl of a 10% suspension of these
cells was then incubated with 20 μl of hybridoma culture supernatant for one
hour at room temperature. The resultant SAC cells were centrifuged, washed,
and resuspended in diluted Taq or Tfl polymerase. The polymerase
enzyme-cell suspensions were incubated overnight at 4°C and centrifuged. The
resultant supernatant was removed and tested for depletion of polymerase
activity using a standard radiochemical assay.

One hybridoma, designated hybridoma 7B12, tested strongly in
the ELISA assay and immunoprecipitated both Taq and Tfl DNA polymerases
(70-99% depletion in polymerase activity). More particularly, in a series of
immunoprecipitations in which the polymerase concentration (Taq or Tfl
polymerase) was varied, the results shown in Table 2B were obtained.



TABLE 2B
IMMUNOPRECIPITATION RESULTS

Trial  Polymerase                  Source of monoclonal antibody     Depletion of
                                                                     polymerase
                                                                     activity

2*     0.04 units Tfl polymerase   hybridoma 7B12 supernatant,       >99%
                                   20 μl
3      0.09 units Tfl polymerase   hybridoma 7B12 supernatant,       80%
                                   20 μl
3      0.27 units Taq polymerase   hybridoma 7B12 supernatant,       79%
                                   20 μl
4      0.30 units Tfl polymerase   1.375 μg purified IgG from        99%
                                   hybridoma 7B12

*Trial 1 was unsuccessful due to an excess of polymerase enzyme relative to
the amount of antibody.

In control immunoprecipitation experiments in which six anti-E. coli DNA
polymerase I monoclonal antibodies were employed (25 μl of hybridoma
culture supernatant), 91 to 112% of the original Tfl polymerase activity was
still detectable in solution (i.e., at most a 9% depletion of Tfl polymerase
activity). The monoclonal antibody from hybridoma 7B12 was
further characterized and found to neutralize Taq and Tfl polymerase activity
at lower temperatures (41°C). This activity assay was performed at 41°C
rather than at higher temperatures (70°C) where the enzymes are more active,
because the antibody itself denatures at higher temperatures.
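
The relationship between "activity still detectable" and "depletion" used
for the control antibodies can be written out explicitly. This is a minimal
sketch with a hypothetical helper name; it only restates the arithmetic
implied by the 91 to 112% figures above.

```python
def percent_depletion(activity_remaining, activity_original):
    """Depletion (%) = 100 * (1 - remaining / original)."""
    return 100.0 * (1.0 - activity_remaining / activity_original)

# Control range reported in the text: 91% to 112% of the original activity
# remained in solution after immunoprecipitation with anti-E. coli pol I
# antibodies.
control_depletions = [percent_depletion(r, 100.0) for r in (91.0, 112.0)]

for remaining, depletion in zip((91.0, 112.0), control_depletions):
    print(f"{remaining:.0f}% remaining -> {depletion:.0f}% depletion")
```

A negative value (here for 112% remaining) simply means that no depletion
was measurable within the scatter of the assay, so the largest depletion
seen in the controls is the 9% quoted above.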

Cells from hybridoma 7B12 were cloned three times by limiting
dilution until all wells with growth tested positive in the ELISA assay (66/66
wells). The monoclonal antibody of hybridoma 7B12, a mouse IgG (γ1, κ)
antibody, is commercially available from Molecular Biology Resources, Inc.,
Milwaukee, Wisconsin as Cat. No. 4100-01.

Subsequent experiments produced two additional
anti-Taq/anti-Tfl monoclonal antibodies that may be (but have not been) used
to affinity-purify DNA polymerase enzymes. In particular, two hybridomas
producing anti-Tfl DNA polymerase monoclonal antibodies, formed using spleen
cells from a mouse immunized with Tfl DNA polymerase, were identified using
the screening procedures outlined above. In the immunoprecipitation assay,
25 μl of supernatant from these two hybridoma cultures, designated hybridomas
10F10 and 11G4, depleted 92% and 95%, respectively, of the DNA
polymerase activity from a solution containing 0.30 units of Tfl DNA
polymerase.

The monoclonal antibodies from hybridoma 7B12 were coupled
with Emphaze™ resin (3M, St. Paul, MN) as follows. Twenty-five milliliters
of antibody solution (2 mg/ml in 0.6 M sodium citrate, 0.05 M sodium
chloride, 0.05 M HEPES pH 8.6) was added to 1.25 g of Emphaze™ resin and
allowed to react for 2 hrs at room temperature. Ethanolamine (1 ml of a 3 M
solution, pH 9.0) was then added to quench unreacted azlactone functional
groups and incubated for 1 hr. at room temperature, then overnight at 4°C.
The resin was washed with and stored in PBS with 0.05% sodium azide.

The immunoaffinity column for the DNA polymerase
purification was prepared with about 10 ml dead volume of the resin and
washed with 300 ml of antibody column high salt buffer B (20 mM Tris-HCl
pH 7.5, 0.5 mM EDTA, 0.5 M NaCl, 0.05% Brij-35). The Tfl DNA
polymerase enzyme was eluted with 10 mM triethylamine (pH 11.6).
Fractions (5 ml each) were collected into tubes with 0.01 volumes of 1 M
HEPES. Those fractions containing the DNA pol I enzyme were identified
by activity assay, pooled, and dialyzed against storage buffer S (50% glycerol,
50 mM Tris-HCl, pH 7.5 at 23°C, 5 mM DTT, 0.1 mM EDTA, 0.5% Tween
20, 0.5% Nonidet P40). The final product was stored at -20°C.

The above purification procedure yielded about 60,000 units of
purified T. flavus DNA polymerase I from 50 g of E. coli [pTFL-1.4] cells,
which is equivalent to 1,200 units/g of cells.

The protein concentration was determined by the method of
Lowry using a modification of the Sigma (St. Louis, MO) Protein Assay Kit
(Cat. No. P5656) with Bovine Serum Albumin as a standard. Both standard
and sample were precipitated with TCA prior to the protein analysis. Using
the standard activity assay, the DNA polymerase specific activity was
calculated to be 79,500 U/mg protein for the recombinant Tfl holoenzyme
purified as described.



EXAMPLE 7

Construction and Expression of a High-Yield,
Full-Length T. flavus DNA Pol I Clone and Purification
of Recombinant T. flavus DNA Pol I Holoenzyme

To increase expression of the Tfl DNA pol I gene and to
increase the yield and DNA polymerase specific activity of recombinant Tfl
DNA pol I, the lacZ promoter was fused directly to the ATG start codon of
the Tfl DNA pol I gene using site-directed mutagenesis, the resultant
improved expression plasmid was expressed, and the recombinant DNA pol
I was purified using a modified procedure.

Site-directed mutagenesis of single-stranded uracil-(U-)
containing DNA from p51X16 was performed using the oligonucleotide
TFL-SDM-3 (Table 2). Single-stranded U-containing DNA was prepared
according to the protocol provided by Bio-Rad (Hercules, CA) in their
Mutagenesis Kit. The new clone, p51X16M1, had the lacZ promoter fused
to about 2 kb of the 5' portion of the Tfl DNA pol I gene (FIGURE 1A).
Plasmid p51X16M1 was digested with BamHI and HincII and ligated to the
1.3 kb BamHI/EcoRV fragment isolated from p21BHc, which provided the 3'
region of the Tfl DNA pol I gene. The resulting plasmid, pTFLRT4 (ATCC



WO 96/14405 



PCT/US95/15327 



Accession No. 69633), was used to transform E. coli DH5αF'IQ (Life
Technologies, Grand Island, NY), generating the 2nd generation expression
clone (FIGURE 1A). The presence and integrity of the Tfl DNA pol I gene
in the insert of pTFLRT4 were confirmed by DNA sequence analysis using
primers RTFLJ, RTFLG, FTFLB, and FTFLE as set out in Table 2.

E. coli DH5αF'IQ transformed with pTFLRT4 were grown in
a 250 liter fermentor in TB medium supplemented with 50 μg/ml ampicillin.
At O.D.600 = 0.7, expression of the plasmid was induced by the addition of
IPTG to a final concentration of 0.5 mM and the cells were cooled down to
20°C, harvested three hours later, and stored at -70°C until use.

Five hundred grams of induced cells were thawed and
suspended in 2500 ml of lysis buffer A (20 mM Tris-HCl, pH 7.4, 0.5 mM
EDTA, 100 mM NaCl, 5% glycerol, 5 mM β-mercaptoethanol, 0.5% Nonidet
P40, 0.5% Tween 20, 50 μg/ml PMSF, 0.5 μg/ml pepstatin A, 0.5 μg/ml
leupeptin).

The cell suspension, to which lysozyme was added to a
concentration of 0.5 mg/ml, was homogenized in a Manton-Gaulin press.
Fresh PMSF again was added to the lysed cells to a final concentration of 50
μg/ml. The suspension of lysed cells was divided into 300 ml portions,
heated to 65°C for 1.5 hrs., and centrifuged for 30 min. at 6,800 x g.

Following this centrifugation, the resulting supernatant was
adjusted to an additional NaCl concentration of 400 mM and 10% PEI, pH
7.5, was added to a final concentration of 0.2%. After 1 hr. of stirring at
4°C, the suspension was centrifuged (1 hr., 6,800 x g) and the resultant
supernatant was precipitated with ammonium sulphate. After centrifugation
for 2 hours at 6,800 x g the resultant pellet was resuspended with 200 ml
Buffer A and applied to a Bio-Rex 70 column (5 x 10 cm) (Bio-Rad). The
column was washed with 1.5 l of buffer A and the DNA polymerase protein
was eluted with 4 l of a 0-1 M NaCl gradient in buffer A. Fractions of 25 ml
were collected, and the peak fractions were pooled and dialyzed against two
changes (2.5 l each) of antibody column high salt buffer B and applied to an
immunoaffinity column (1.5 x 8 cm) prepared as described above. After
washing the immunoaffinity column with 250 ml antibody column high salt
buffer B, the enzyme was eluted with 10 mM triethylamine (pH 11.6).
Fractions of 5 ml were collected and the peak fractions were dialyzed against
storage buffer S. This procedure yielded about 2,000,000 units of purified T.
flavus DNA polymerase from 500 g of E. coli [pTFLRT4] cells, or about
4,000 units/g of cells as measured using the standard assay described above.
The calculated DNA polymerase specific activity was 217,600 U/mg for this
preparation of Tfl holoenzyme.

The N-terminal amino acid sequence from recombinant DNA
pol I holoenzyme isolated from E. coli [pTFLRT4] was determined as
Met-Glu-Ala-Ile-Val-Pro-Leu-Phe-Glu-Pro. This sequence matches the amino
terminal sequence deduced by translation of the T. flavus DNA pol I gene
sequence (FIGURE 2), indicating that the translation starts at the predicted
position. Unlike the native holoenzyme studied, no blockage of the terminal
methionine in the cloned holoenzyme was observed.

EXAMPLE 8

Cloning and Expression of the
Exo- Fragment of T. flavus DNA Polymerase I

Expression studies using plasmids p21E10 and p21EHc were
performed because these plasmids contain the 3' two-thirds of the DNA
polymerase I gene fused to the lacZ operator/promoter. As deduced from the
DNA sequence, the first amino acid encoded by the insert of plasmid p21E10
corresponds to Glu 239 in FIGURE 2 (circled). It was hypothesized that the
insert in p21E10 would encode a fragment of DNA polymerase I lacking the
exonuclease domain (the exo- fragment) due to the absence of the 5' one-third
portion of this gene.

From translation of sequence information obtained from the
5'-end of p21E10 using primer RSP (Table 2), it was concluded that the insert
encoding the 3' two-thirds of the Tfl DNA pol I gene was out-of-frame. It
was assumed that the same out-of-frame fusion was present in p21EHc.
However, in spite of the frame shift, some heat-stable DNA polymerase
activity was obtained from the clone harboring p21E10.

The ATG start codon of lacZ was brought in frame with the
Tfl DNA polymerase exo- fragment in p21EHc using site-directed
oligonucleotide mutagenesis (FIGURE 1B). A mutagenic oligonucleotide
TFL-SDM-1 was designed (Table 2), part of TFL-SDM-1 having homology
to nucleotides 1015-1032 in FIGURE 4, the other part having homology to the
vector. Single-stranded U-containing DNA was prepared by standard
procedures and the chemically synthesized oligonucleotide TFL-SDM-1 was
used to obtain site-directed changes in the newly synthesized DNA. This
DNA was used to transform competent E. coli DH5αF'. Several
transformants were selected and grown up for plasmid analysis. Of these,
sequence analysis was performed on four clones using the [γ-32P] end-labeled
primer "5' lac PCR" (Table 2) (Synthetic Genetics). The clones with the DNA
polymerase gene fragment in the proper reading frame (which includes the
ATG from the lacZ coding sequence, followed by "GAA GAC..." derived
from the Tfl DNA pol I gene - see FIGURE 2, nucleotides 1015-1020 et seq.)
were included in expression studies. Overexpressed recombinant protein was
then isolated and purified from E. coli transformed with one of the clones,
p21EHcM1.1 (ATCC Accession No. 69632), by following the procedures
outlined below.

E. coli DH5αF' [p21EHcM1.1] was grown in a 50 liter
fermentor in TB medium (Sambrook et al., Molecular Cloning, A Laboratory
Manual, 2nd ed. (1989)) supplemented with 50 μg/ml ampicillin with vigorous
aeration at 37°C. At O.D.600 = 1.0, IPTG was added to a final concentration
of 0.5 mM and cells were cultured for an additional 2 hours. The culture
was cooled down to 20°C and 100 ml of 100 mM PMSF in isopropanol was
added. After brief mixing, the culture was spun down in a Sharples centrifuge
and stored frozen at -70°C.

Fifty grams of E. coli [p21EHcM1.1] were thawed in 250 ml
of lysis buffer A (20 mM Tris-HCl pH 7.4, 0.5 mM EDTA, 100 mM KCl,
10 mM MgCl2, 5% glycerol, 5 mM β-mercaptoethanol, 0.5% Nonidet P40,
0.5% Tween 20, 50 μg/ml PMSF, 0.5 μg/ml pepstatin A, 0.5 μg/ml
leupeptin). The cell suspension was homogenized twice in a Manton-Gaulin
press. After the first and second passes, fresh PMSF was added to a
final concentration of 50 μg/ml. The suspension of broken cells was
divided into 100 ml portions and heated to 65°C for 1 hr. Cell debris and
denatured proteins were removed by centrifugation at 6,800 x g for 30 min.
and the supernatant was adjusted to an additional NaCl concentration of
400 mM. Then 10% PEI, pH 7.5, was slowly added to a final concentration of
0.2%. After 30 min. of stirring at 4°C, the suspension was centrifuged
(1 hr., 6,800 x g) and the supernatant was concentrated on a YM30 membrane
to 100-120 ml. The concentrate was run through a 5 x 50 cm Sephadex G-25
column equilibrated in buffer A, as described in Example 6. The crude Tfl
exo- fragment was applied to a Procion-Red Sepharose column (5 x 10 cm). The
column was washed with 1.5 liters of buffer A and the DNA polymerase
fragment was eluted with 4 liters of a 0-1.5 M NaCl gradient in buffer A.
Fractions of 25 ml were collected and the fractions with DNA polymerase
activity were dialyzed against two changes (1 liter each) of antibody column
high salt buffer B (20 mM Tris-HCl, pH 7.5, 0.5 mM EDTA, 0.5 M NaCl,
0.05% Brij-35) and applied to an immunoaffinity column (1.5 x 8 cm). After
washing the column with 250 ml of the same buffer, the enzyme was eluted
with 10 mM triethylamine (pH 11.6) and treated as described above. In
general, about 300,000 units of purified T. flavus exo- fragment were obtained
from 50 g of E. coli [p21EHcM1.1] cells (6,000 units/g).

The protein concentration was determined as described above.
The calculated DNA polymerase specific activity for the Tfl exo- fragment was
600,000 U/mg.
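
The yields and specific activities reported in Examples 6 to 8 can be put
side by side with a few lines of arithmetic. In the sketch below the unit
totals, cell masses and specific activities are taken from the text; the
"implied protein" column (units divided by specific activity) is a derived
estimate that the text itself does not report, and the dictionary layout
and names are hypothetical.

```python
# (total units, grams of cells, specific activity in U/mg) as stated in the text.
PREPARATIONS = {
    "Example 6, Tfl holo [pTFL-1.4]": (60_000, 50, 79_500),
    "Example 7, Tfl holo [pTFLRT4]": (2_000_000, 500, 217_600),
    "Example 8, Tfl exo- [p21EHcM1.1]": (300_000, 50, 600_000),
}

def units_per_gram(total_units, cells_g):
    """Yield of polymerase activity per gram of cell paste."""
    return total_units / cells_g

def implied_protein_mg(total_units, specific_activity):
    """Derived estimate (not stated in the text): units / (U/mg) = mg protein."""
    return total_units / specific_activity

summary = {
    name: (units_per_gram(u, g), implied_protein_mg(u, sa))
    for name, (u, g, sa) in PREPARATIONS.items()
}

for name, (yield_per_g, protein_mg) in summary.items():
    print(f"{name}: {yield_per_g:,.0f} U/g cells, ~{protein_mg:.2f} mg protein")
```

The per-gram yields reproduce the 1,200, 4,000 and 6,000 units/g figures
quoted in the three examples.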

Once the Tfl exo- fragment was cloned and expressed, the
N-terminal amino acid sequence was determined. About 50 μg of the purified
enzyme was separated using SDS-PAGE and blotted onto PVDF membrane
as described for the holoenzymes. The major band was excised and subjected
to sequence analysis. The chromatogram of the sequencer indicated the
presence of a major and a minor sequence. The minor sequence represents
the major sequence shifted by one amino acid. The major sequence reads:
Leu-Glu-Arg-Leu-Glu-Phe-Gly-Ser-Leu-Leu-His-Glu-Phe-X-Leu-Leu-X-Ala-
Pro-Ala (where X represents an amino acid whose identity was uncertain from
the chromatogram). The minor sequence has the amino acid sequence: Glu-
Arg-Leu-Glu-Phe-Gly-Ser-Leu-X-His-Glu-Phe-Gly-X-X-Pro-X-X-Ala-Pro.
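
The statement that the minor sequence is the major sequence shifted by one
residue can be checked directly from the two reads quoted above. The
following is a minimal sketch (function name hypothetical) that aligns the
minor read against the major read starting at its second residue and treats
every position where either read shows X as undetermined.

```python
MAJOR = ("Leu-Glu-Arg-Leu-Glu-Phe-Gly-Ser-Leu-Leu-His-Glu-Phe-X-Leu-Leu-X-Ala-"
         "Pro-Ala").split("-")
MINOR = ("Glu-Arg-Leu-Glu-Phe-Gly-Ser-Leu-X-His-Glu-Phe-Gly-X-X-Pro-X-X-Ala-"
         "Pro").split("-")

def compare_with_offset(major, minor, offset=1):
    """Count agreeing, conflicting and undetermined positions when the minor
    read is aligned to the major read starting `offset` residues in."""
    agree = conflict = undetermined = 0
    for major_res, minor_res in zip(major[offset:], minor):
        if "X" in (major_res, minor_res):
            undetermined += 1      # at least one call was uncertain
        elif major_res == minor_res:
            agree += 1
        else:
            conflict += 1
    return agree, conflict, undetermined

agree, conflict, undetermined = compare_with_offset(MAJOR, MINOR, offset=1)
print(f"agree={agree}, conflict={conflict}, undetermined={undetermined}")
```

With a one-residue offset, every position at which both reads give a
definite call agrees, which is consistent with the interpretation given in
the text.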

The major sequence is identical to the amino acid sequence
deduced from the recombinant Tfl exo- fragment DNA sequence, except for
the lack of 37 N-terminal amino acids, including the N-terminal methionine.
SEQ ID NO: 3 and 4 contain the DNA sequence and the deduced amino acid
sequence of the Tfl exo- fragment, as expected from construct p21EHcM1.1.
The loss of the 37 N-terminal amino acids may be due to processing of the
exo- fragment in the E. coli host. SEQ ID NO: 5 contains the amino acid
sequence of the major band exo- fragment, as deduced from the N-terminal
amino acid sequence of the purified exo- fragment and from the DNA
sequence of plasmid p21EHcM1.1. The minor sequence presented here is the
Tfl exo- fragment lacking both the N-terminal methionine and the next 37
amino acids. Although the amount sequenced in the minor species was small,
there was good correlation with the deduced amino acid sequence, except for
the proline at position 16, which was expected to be glutamic acid.

EXAMPLE 9

Characterization of T. flavus
DNA Polymerase I Exonuclease Activities

The purity and molecular weight of the T. flavus DNA
polymerase and the Tfl exo- fragment were estimated by SDS-polyacrylamide
gel electrophoresis using the Pharmacia PhastSystem (Piscataway, NJ).
FIGURE 9 shows the purity of the holoenzyme and the Tfl exo- fragment,
which were separated on a 12.5% SDS-PAGE gel and stained with silver.

Assays were performed to determine intrinsic/extrinsic
exonuclease, endonuclease, and DNase activities of the DNA polymerase
enzyme preparations purified as described above and for T. aquaticus DNA
pol I holoenzyme (Taq holo) and Stoffel fragment (Stoffel, Perkin Elmer,
Foster City, CA, Cat. No. N808-0038), and for T. thermophilus holoenzyme
(Tth holo) (Molecular Biology Resources, Inc., Cat. No. 1115-01,
Milwaukee, WI). The protocols are described below and results summarized
in Table 3A.

A 3'→5' exonuclease activity assay was performed in a final
volume of 10 μl containing 50 mM Tris-HCl, pH 7.6, 10 mM MgCl2, 1 mM
DTT, 0.15 μg of [3'-3H dCTP and dGTP labeled] λ DNA/Taq I fragments and
5, 10 and 20 units of enzyme. Each sample was overlaid with 10 μl of light
mineral oil and incubated at 70°C for 1 hour. The reaction was terminated
by the addition of 50 μl yeast RNA and 200 μl of 10% TCA. After incubation
for 10 min. on ice, the samples were centrifuged for 7 min. in a
microcentrifuge. 200 μl of supernatant was added to 6 ml of scintillation fluid
and counted in a scintillation counter. The results are presented in Table 3A
as the slope of % label released per unit of enzyme.
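
The "slope" reported in Table 3A is the change in percent label released
per unit of enzyme across the 5, 10 and 20 unit reactions. The text does not
say how the slope was fitted; the sketch below assumes an ordinary
least-squares line, and the release values are HYPOTHETICAL numbers chosen
to be perfectly linear, not data from this document.

```python
def least_squares_slope(x_values, y_values):
    """Slope of the ordinary least-squares line through (x, y) points."""
    n = len(x_values)
    mean_x = sum(x_values) / n
    mean_y = sum(y_values) / n
    sxx = sum((x - mean_x) ** 2 for x in x_values)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(x_values, y_values))
    return sxy / sxx

enzyme_units = [5, 10, 20]               # enzyme amounts used in the assay
percent_released = [0.95, 1.90, 3.80]    # HYPOTHETICAL % label released

slope = least_squares_slope(enzyme_units, percent_released)
print(f"slope = {slope:.3f} % label released per unit of enzyme")
```

Using three enzyme amounts rather than one makes the reported figure less
sensitive to a single outlying reaction.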

A 5'→3' exonuclease assay was performed in a manner identical to the
3'→5' exonuclease assay, except for the use of [5'-32P] λ DNA/HaeIII
fragments instead of the 3'-labeled substrate.

Double-stranded and single-stranded DNase assays were
performed using the protocol for the 3'→5' exonuclease assay, except for the
use of [32P] λ DNA instead of the 3'-labeled substrate. The DNA was treated
for 3 min. at 100°C and immediately chilled on ice prior to assaying for
single-stranded DNase activity.

An assay for endonuclease activity was performed as follows.
The reagents (final concentrations of 50 mM Tris-HCl, pH 7.6, 10 mM
MgCl2, 1 mM β-mercaptoethanol), 0.5 μg pBR322, less enzyme/H2O, were
mixed and kept on ice. The required amount of H2O and 10 μl mineral oil
were added to each tube. The reaction was started with the addition of 5, 10
or 20 units of enzyme; the final reaction volume was 10 μl. The samples
were incubated at 70°C for 1 hour. Two μl of a solution containing 0.25%
bromophenol blue, 1 mM EDTA, and 40% sucrose was added to the reaction,
and after a short centrifugation, 6 μl of the bottom layer was removed and
electrophoresed on 1.5% agarose gels in 1 x TBE. The mobility change from
the supercoiled to the linear form of pBR322 was recorded.



TABLE 3A

Enzyme          3'→5' exo-   5'→3' exo-   ss        ds        Endo-
                nuclease     nuclease     DNase     DNase     nuclease

Tfl holo (r)    0.19         0            0.66      0         0
Tfl holo (n)    0.04         0.01         0         0         0
Tfl exo-        0.03         0.002        0         0         0
Taq holo        0.031        0.09         0         0         0
Stoffel         0            0.01         0.28      0.1       0
Tth holo        0.07         0.04         0.2       0         0

The values for 3'→5' exonuclease activity and for 5'→3'
exonuclease activity are low for all DNA polymerases tested. The differences
in exonuclease and DNase activities between naturally occurring and
recombinant Tfl holoenzyme are not believed to be statistically significant.

EXAMPLE 10

Comparison of the T. flavus
and T. aquaticus DNA Polymerases

Biological properties of native T. flavus DNA pol I (nTfl Holo,
lot #30419; Molecular Biology Resources, Inc., Cat. No. 1112-01,
Milwaukee, WI); recombinant T. flavus DNA pol I holoenzyme (rTfl Holo)
purified from E. coli [pTFLRT4]; T. flavus DNA pol I exo- fragment (Tfl
exo-) purified from E. coli [p21EHcM1.1]; T. aquaticus DNA polymerase I
(native Taq or recombinant AmpliTaq) holoenzyme; and the AmpliTaq DNA
polymerase Stoffel fragment were compared using a number of protocols
described below.

The molecular weights and purities of the preparations of the
various enzymes were estimated by acrylamide gel electrophoresis utilizing the
Pharmacia PhastSystem (Piscataway, NJ) for electrophoresis and silver
staining. A comparison of the apparent molecular weights estimated from
7.5% and 12.5% acrylamide gels and the calculated molecular weights derived
from available sequence data is given in Table 3B. The apparent molecular
weight of the holoenzymes using either acrylamide concentration was less than
the calculated molecular weights. A purity of greater than 95% was estimated
for all DNA polymerases analyzed: i.e. Tfl and Taq holoenzymes, Tfl exo-
fragment and Stoffel fragment.

TABLE 3B

                               Apparent Mol. Weight
Enzyme                         7.5% gel     12.5% gel    Calculated
                                                         Mol. Weight

Tfl holoenzyme*                80,000       84,000       93,969
Recombinant Tfl exo- fragment  59,000       59,000       62,979
Taq holoenzyme                 82,000       85,000       93,904
Stoffel                        60,000       61,000       61,000

*native and recombinant
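
The gap between apparent and calculated molecular weights in Table 3B can
be expressed as a percentage. The sketch below uses the table values as
printed; the data layout and function name are hypothetical, and the
percentages are derived figures that do not appear in the text.

```python
# enzyme: (apparent MW on 7.5% gel, apparent MW on 12.5% gel, calculated MW)
TABLE_3B = {
    "Tfl holoenzyme": (80_000, 84_000, 93_969),
    "Tfl exo- fragment": (59_000, 59_000, 62_979),
    "Taq holoenzyme": (82_000, 85_000, 93_904),
    "Stoffel": (60_000, 61_000, 61_000),
}

def percent_difference(apparent, calculated):
    """Signed difference of the apparent from the calculated MW, in percent."""
    return 100.0 * (apparent - calculated) / calculated

differences = {
    enzyme: (percent_difference(gel_7_5, calc), percent_difference(gel_12_5, calc))
    for enzyme, (gel_7_5, gel_12_5, calc) in TABLE_3B.items()
}

for enzyme, (d_low, d_high) in differences.items():
    print(f"{enzyme}: {d_low:+.1f}% (7.5% gel), {d_high:+.1f}% (12.5% gel)")
```

Negative values indicate that the protein ran below its calculated size,
which is the case for both holoenzymes on either gel.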



Using the Pharmacia PhastSystem, the polymerases and
standards were subjected to isoelectric focusing. The experimentally derived
pI values of the samples, including samples of E. coli DNA pol I holoenzyme
(E. coli pol I) and Klenow fragment, were compared to values calculated from
derived amino acid sequence information. The results are given in Table 4.



TABLE 4

pI Values

Enzyme           Calculated pI    Measured pI

nTfl Holo        6.23             6.25
rTfl Holo        6.23             6.43
A&V Tfl*         5.73             (not available)
Tfl exo-         6.37             5.94
Taq Holo         6.00             6.14
Taq Stoffel      5.93             5.83
E. coli pol I    5.29             5.12
Klenow           5.60             5.75

*Purported T. flavus DNA pol I protein sequence published by
Akhmetzjanov and Vakhitov.

The relative DNA polymerase activities of the enzymes were
assayed at 70°C at different pH values. The pH of selected buffers was
adjusted at 23°C, to permit direct comparison to published results. Table 5A
shows the measured pH values at 70°C for 1x buffers which first had been
titrated at 23°C. Unless otherwise indicated, pH values reported herein were
adjusted at about 23°C.



TABLE 5A

Change of pH as a function of temperature

No.   Buffer               pH at 23°C   pH at 70°C

1.    PIPES-NaOH           6.0          5.5
2.    PIPES-NaOH           6.5          6.0
3.    Tris-HCl             7.5          6.4
4.    Tris-HCl             8.0          7.0
5.    Tris-HCl             8.5          7.4
6.    Tris-HCl             9.0          8.0
7.    Tris-HCl             9.5          8.6
8.    Triethylamine-HCl    9.5          8.9
9.    Triethylamine-HCl    10.0         9.15
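
The temperature dependence in Table 5A can be summarized as an average pH
shift per buffer system. The sketch below uses the table values as printed;
the names are hypothetical, and the per-degree figure assumes, purely for
illustration, that the shift is linear between 23°C and 70°C.

```python
# (buffer, pH at 23 C, pH at 70 C), as listed in Table 5A.
TABLE_5A = [
    ("PIPES-NaOH", 6.0, 5.5),
    ("PIPES-NaOH", 6.5, 6.0),
    ("Tris-HCl", 7.5, 6.4),
    ("Tris-HCl", 8.0, 7.0),
    ("Tris-HCl", 8.5, 7.4),
    ("Tris-HCl", 9.0, 8.0),
    ("Tris-HCl", 9.5, 8.6),
    ("Triethylamine-HCl", 9.5, 8.9),
    ("Triethylamine-HCl", 10.0, 9.15),
]

def mean_shift(buffer_name, rows=TABLE_5A):
    """Average of (pH at 70 C - pH at 23 C) over all rows for one buffer."""
    shifts = [ph70 - ph23 for name, ph23, ph70 in rows if name == buffer_name]
    return sum(shifts) / len(shifts)

DELTA_T = 70 - 23  # degrees C between the two measurements

for buffer_name in ("PIPES-NaOH", "Tris-HCl", "Triethylamine-HCl"):
    shift = mean_shift(buffer_name)
    # ASSUMPTION: linear temperature dependence, used only for the per-degree value.
    print(f"{buffer_name}: mean shift {shift:+.2f} pH units "
          f"({shift / DELTA_T:+.3f} per degree C)")
```

For the Tris-HCl rows the mean shift is close to one pH unit, which matches
the note accompanying Table 5B.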



The activity assays were performed in a 100 μl (final volume)
reaction mixture, containing 0.1 mM dCTP, dTTP, dGTP, [α-33P]dATP, 0.3
mg/ml activated calf thymus DNA and 0.5 mg/ml BSA in a set of buffers
containing: 50 mM KCl, 1 mM DTT, 10 mM MgCl2 and 50 mM of one of
three buffering compounds: PIPES, Tris or Triethylamine. Three dilutions
(20, 40 or 80 U/μl) of each polymerase enzyme were prepared, and 5 μl of
a dilution was added to the reaction mixture, followed by incubation at 70°C
for 30 min. The experiment was performed twice, each time using duplicate
samples. FIGURE 5 graphically depicts the relative activities of the enzymes
studied, calculated as the ratio of counts per minute (corrected for background
and enzyme dilution) at a given pH to counts per minute at the maximum
value for that enzyme. The optimal ranges (>80% activity) for the five
enzymes tested are provided in Table 5B.
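
The relative-activity calculation and the >80% criterion described above
can be sketched as follows. The counts are HYPOTHETICAL and serve only to
show the arithmetic; the function names, the background value and the way
the dilution correction is applied (a simple multiplicative factor) are
assumptions, not details given in the text.

```python
def relative_activity(cpm_by_ph, background_cpm, dilution_factor=1.0):
    """Percent activity at each pH relative to the maximum for that enzyme,
    after background subtraction and correction for enzyme dilution."""
    corrected = {
        ph: (cpm - background_cpm) * dilution_factor
        for ph, cpm in cpm_by_ph.items()
    }
    peak = max(corrected.values())
    return {ph: 100.0 * value / peak for ph, value in corrected.items()}

def optimal_range(relative, threshold=80.0):
    """Lowest and highest pH at which relative activity exceeds the threshold."""
    hits = sorted(ph for ph, percent in relative.items() if percent > threshold)
    return (hits[0], hits[-1]) if hits else None

# HYPOTHETICAL counts per minute for one enzyme at several pH values.
cpm = {7.5: 5200, 8.0: 7600, 8.5: 9800, 9.0: 10100, 9.5: 8500, 10.0: 4100}

relative = relative_activity(cpm, background_cpm=100)
print(relative)
print("optimal pH range:", optimal_range(relative))
```

Because each enzyme is normalized to its own maximum, the curves compare
the shape of the pH response rather than absolute activity.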



TABLE 5B

Optimal pH ranges (as titrated at 23°C)*

Enzyme                         pH

Native Tfl holoenzyme          9.5-10.5
Recombinant Tfl holoenzyme     9.5-10.5
Tfl exo- fragment              7.5-9.8
Stoffel fragment               7.5-9.8
AmpliTaq                       7.5-9.3

*These values are about 1 pH unit higher than for buffers measured at 70°C
(see Table 5A).

The pH protocol described above was modified to determine the
influence of MgCl2 concentration on the activities of the DNA polymerases.
The reaction buffers included 50 mM Tris-HCl pH 8.3 (23°C) and MgCl2
concentrations from 0.36 to 50 mM. Three independent experiments were
performed and curves were constructed (FIGURE 6A) showing the relative
activity of Tfl exo- fragment, Tfl holoenzyme (native and recombinant), and
Taq Stoffel fragment. The higher limit for the Stoffel fragment was
extrapolated. The optimal ranges (>80% activity) are 1.3-13 mM MgCl2 for
the Tfl exo- fragment, and 2.3-33 mM MgCl2 for the Stoffel fragment. The
recombinant and the native Tfl holoenzyme showed greatest activity at 50 mM
MgCl2.

The above protocol was modified to determine the influence of
MnCl2 concentration on the activities of the DNA polymerases (in the absence
of magnesium ions). The reaction buffers included MnCl2 concentrations
from 0.1 to 20 mM. Due to the precipitation of oxidation products (MnO2)
of MnCl2, the reaction buffers were prepared just prior to the assay. The pH
of the buffer was adjusted to pH 8.7 before the addition of a 1 M stock
solution of MnCl2. The pH was finally adjusted to 8.3 at 23°C. Three
independent experiments were performed and a curve was constructed
(FIGURE 6B) showing relative activity of the enzymes. The optimal ranges
for the four enzymes tested are 2.1-11 mM MnCl2 for the Tfl exo- fragment,
4-20 mM MnCl2 for the Stoffel fragment and 0.8-4 mM MnCl2 for the
recombinant and native Tfl holoenzymes.

The thermostability and temperature optimum of the polymerase
enzymes were determined by incubating 10 units of enzyme for 30 min. at
23, 37, 60, 65, 70, 75, 80, and 90°C, in 100 µl of buffer used for the
determination of polymerase activity (including 50 mM Tris-HCl pH 8.3
(23°C) and 1.5 mM MgCl2) in a DNA polymerase activity assay as described
above. The polymerase activity was then determined by acid precipitation of
the polymerization product as described above. FIGURE 7A depicts the
percent relative activity, calculated as described above. The temperature
optima were 70-75°C for the Stoffel and Tfl exo- fragments and 80°C for the
native and recombinant Tfl holoenzymes. At 90°C there was 14%, 6% and
8% of the activity left in the Tfl holoenzymes, the Stoffel fragment, and the
Tfl exo- fragment, respectively.

The PCR half lives of the enzymes were determined in 100 µl
PCR reactions, performed in duplicate, substituting the appropriate buffer in
the PCR cocktail prepared for each individual enzyme. The cocktail for the
Tfl exo- fragment contains 1x Tfl pol buffer (50 mM Tris-HCl, pH 9.0 at
23°C, 20 mM (NH4)2SO4, 1.5 mM MgCl2), 200 µM of each dNTP, 0.5 µM
of primer FTFL2 and primer RTFL4, and 15 ng of T. flavus genomic DNA.
The buffers for other enzymes tested were as follows: Taq pol I (1x Taq pol
buffer: 10 mM Tris-HCl pH 8.4, 50 mM KCl, 1.5 mM MgCl2); Tfl DNA pol
I holoenzyme (1x Tfl pol buffer); and Stoffel fragment (1x Stoffel buffer: 10
mM Tris-HCl, pH 8.3, 10 mM KCl and 2.5 mM MgCl2). Duplicate samples
were denatured at 95°C for 5 min. and held at 72°C until 10 units of enzyme
were added, and the samples were then cycled 0, 20, 25, 30, 35, 50 and 100
times as described in Example 3. The samples were analyzed on 1.2%
agarose gels using ethidium bromide to visualize the presence of specific PCR
product. The expected length of PCR product was about 800 bp. Reactions
containing Taq DNA pol I, Stoffel and Tfl exo- fragments had visible product
after 20 cycles, whereas reactions with Tfl holoenzyme showed product only
after 30 cycles. The Tfl exo- fragment synthesized more product than the
Stoffel fragment (FIGURE 7B). In general, there was some background, very
likely because of the large amount of enzyme in the reaction. This
background was not observed when 1 to 5 units of Tfl exo- fragment were
used in a 35 cycle regimen (FIGURE 7B, right lane). The polymerase
activity in each tube was also determined as described above following
completion of the PCR cycling, and the result plotted in an enzyme cycling
stability curve (FIGURE 8). The half life was estimated to be: 25 cycles for
both the Taq holoenzyme and Stoffel fragment, 20 cycles for the Tfl
holoenzyme, and 16 cycles for the Tfl exo- fragment.
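
The text does not state how the half lives were read from the cycling stability curve. One plausible way, shown here purely as a sketch, is to interpolate between the two cycle numbers that bracket 50% residual activity, working on a logarithmic activity scale. The residual activities below are synthetic values generated from an assumed exponential decay; they are not the measurements plotted in FIGURE 8.

```python
import math


def half_life_cycles(cycles, residual):
    """Estimate the cycle count at which residual activity reaches 50%.

    Interpolates linearly in log(activity) between the two sampled
    points that bracket 0.5. Returns None if 0.5 is never crossed.
    """
    points = list(zip(cycles, residual))
    for (c0, a0), (c1, a1) in zip(points, points[1:]):
        if a0 >= 0.5 >= a1 and a0 > a1 > 0:
            frac = math.log(a0 / 0.5) / math.log(a0 / a1)
            return c0 + frac * (c1 - c0)
    return None


# Cycle numbers sampled in the experiment described above.
cycle_points = [0, 20, 25, 30, 35, 50, 100]
# Synthetic residual activities assuming a 16-cycle half life (illustration only).
hypothetical_residual = [0.5 ** (c / 16) for c in cycle_points]
estimate = half_life_cycles(cycle_points, hypothetical_residual)
print(round(estimate, 2))
```

Because the first sampled point after cycle 0 is cycle 20, any half life shorter than 20 cycles is necessarily an interpolated value under this approach.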

EXAMPLE 11 

DNA Sequencing with T. flavus DNA Polymerases

A.

Native and recombinant Tfl holoenzyme, Tfl exo- fragment,
AmpliTaq, and Stoffel fragment were employed in the SEQUAL™ DNA
Polymerase Sequencing System (CHIMERx) to test their performance in DNA
sequencing using ssDNA template and labeled primer.

The primer FSP (Table 2) was end-labeled with T4 kinase and
[γ-32P]ATP according to the CHIMERx protocol. A 10 µl labeling reaction
was prepared containing 0.5 µl primer (10 pmol/µl), 1.0 µl T4 Kinase 10X
buffer (500 mM Tris-HCl pH 7.5, 100 mM MgCl2, 50 mM DTT, 1 mM
spermidine), 3.0 µl [γ-32P]ATP (6000 Ci/mmol, 10 µCi/µl), 0.5 µl T4 kinase
(15 U/µl), and 5.0 µl H2O. The labeling reaction was incubated at 37°C for
10 min., and the kinase was inactivated by incubation at 65°C for 10 min.
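
As a consistency check on the labeling reaction, the component volumes can be summed and the final concentrations worked out. This is a minimal sketch that assumes the volumes and stock concentrations as read from the paragraph above (all volumes in microliters); the variable and function names are illustrative only.

```python
def final_conc(stock, volume_ul, total_ul):
    """Concentration after diluting `volume_ul` of stock into `total_ul`."""
    return stock * volume_ul / total_ul


# Component volumes in microliters, as read from the paragraph above.
volumes_ul = {
    "primer": 0.5,
    "10X kinase buffer": 1.0,
    "[gamma-32P]ATP": 3.0,
    "T4 kinase": 0.5,
    "H2O": 5.0,
}
total_ul = sum(volumes_ul.values())

# 10X buffer composition in mM.
buffer_10x_mM = {"Tris-HCl": 500, "MgCl2": 100, "DTT": 50, "spermidine": 1}
buffer_final_mM = {
    name: final_conc(conc, volumes_ul["10X kinase buffer"], total_ul)
    for name, conc in buffer_10x_mM.items()
}

primer_final_pmol_per_ul = final_conc(10, volumes_ul["primer"], total_ul)
kinase_units = 15 * volumes_ul["T4 kinase"]          # 15 U/µl stock
label_microcuries = 10 * volumes_ul["[gamma-32P]ATP"]  # 10 µCi/µl stock

print(total_ul, buffer_final_mM, primer_final_pmol_per_ul)
```

The resulting primer concentration of 0.5 pmol/µl agrees with the labeled primer concentration quoted for the sequencing cocktails that follow.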

The sequencing reactions for native and recombinant Tfl
holoenzyme and the exo- fragment were set up using 2-5 units of enzyme,
according to CHIMERx conditions. Briefly, a reaction cocktail was prepared
containing 16.0 µl ssM13mp18 DNA (approx. 1 µg), 5.0 µl Sequal 5x buffer
(250 mM Tris-HCl, pH 9.5, 12.5 mM MgCl2), 1.0 µl labeled primer (0.5
pmol/µl), and 2-5 units of enzyme. Four d/ddNTP mixtures were also
prepared (A mix: 20 µM dATP, 60 µM dCTP, 60 µM dGTP, 60 µM dTTP,
300 µM ddATP; C mix: 60 µM dATP, 20 µM dCTP, 60 µM dGTP, 60 µM
dTTP, 150 µM ddCTP; G mix: 60 µM dATP, 60 µM dCTP, 20 µM dGTP,
60 µM dTTP, 30 µM ddGTP; T mix: 60 µM dATP, 60 µM dCTP, 60 µM
dGTP, 20 µM dTTP, 400 µM ddTTP). The sequencing reactions were
performed by mixing 5 µl of reaction cocktail with 1 µl of the appropriate
d/ddNTP mixture and heating the reaction tube at 65°C for 10 min. After
this incubation 3 µl of stop solution (EDTA/DTT/bromophenol blue/xylene
cyanol) were added and the reactions were placed on ice.

AmpliTaq reaction cocktail was prepared by using the 10x
Reaction Buffer provided with the Cycle Sequencing Kit, which contains 2
units of enzyme in a final volume of 20 µl. Stoffel fragment reaction cocktail
(20 µl) contained 4 µl of 5x Stoffel fragment reaction buffer, 2 µl of 25 mM
MgCl2, both provided with the enzyme (Perkin Elmer), and 2 units of
enzyme. Both cocktails included 1 µg of ssDNA template and 1 µl of labeled
primer FSP. For both enzymes, sequencing reactions were performed by
mixing 5 µl reaction cocktail with 1 µl d/ddNTP mixtures and incubating for
10 min. at 65°C. Three microliters of stop solution were then added to the
reactions, and the reactions were placed on ice.

The reactions were heated at 90°C for 5 min. just before
loading onto a 6% sequencing gel. One microliter of each sample was loaded
and electrophoresed at 3000 volts for 1.5 hours. The gel was autoradiographed
and analyzed. FIGURE 10A photographically depicts a portion of a
sequencing gel showing the same DNA sequence for all enzymes used, except
for the native T. flavus DNA pol I holoenzyme control. Very little
background was observed when the Tfl exo- fragment, Tfl holoenzyme and
AmpliTaq were used. The Stoffel fragment had more ghost bands than the
other enzymes. However, no attempt had been made to optimize the reaction
conditions for the Stoffel fragment.

B. 

To demonstrate the utility of the recombinant Tfl holoenzyme
and the Tfl exo- fragment in cycle sequencing with single-stranded DNA
template, these enzymes were substituted into the SEQUAL™ DNA
Polymerase and Cycle-SEQUAL™ Sequencing System (CHIMERx) and the
protocols provided were followed.

The labeling protocol described above was repeated to create
end-labeled primer. A 22 µl reaction cocktail was then prepared containing
approx. 20 ng ssM13mp18 DNA, 5.0 µl 5X Sequal sequencing buffer, 1.0 µl
labeled primer (0.5 pmol/µl), balance H2O. Native or recombinant Tfl
holoenzyme or Tfl exo- fragment was then added to the cocktail (0.5 units)
and gently mixed. For comparison purposes, two units of AmpliTaq or
Stoffel fragment were added to the ssM13mp18 DNA template (20 ng); the
manufacturer's reaction conditions for the Perkin Elmer Cycle Sequencing Kit
were followed for AmpliTaq, and, for the Stoffel fragment, 4 µl of the Stoffel
buffer and 2 µl of the MgCl2 solution provided with the enzyme were used.

The sequencing reactions were performed by mixing 5 µl of
reaction cocktail with 1 µl of each d/ddNTP mixture (in separate tubes),
adding a drop (~10 µl) of mineral oil to each tube, and placing the tubes into
a preheated (94°C) thermal cycler programmed to run the following cycle
twenty times: 94°C for 15 seconds (denaturation), 70°C for 60 seconds
(extension). The reactions were cooled to 4°C after 20 cycles until 4 µl of stop
solution were added, and then the reactions were set on ice.

Immediately after heating the reaction mixtures at 90°C for 5
min., one microliter of each reaction mixture was loaded onto a 6%
sequencing gel. FIGURE 10B shows that the Tfl exo- fragment and
recombinant Tfl holoenzyme yield clean sequence data, whereas in the
AmpliTaq lanes some ghost bands were observed. The Stoffel fragment,
under the conditions used here, did not produce comparable sequencing data.
C. 

The utility of recombinant Tfl holoenzyme and Tfl exo-
fragment for sequencing with internal labeling using double-stranded DNA
template was demonstrated in a sequencing reaction in which an [α-35S]-dATP
labeling protocol and double-stranded pUC19 template were used. The
experiment was performed as outlined in the SEQUAL™ DNA Polymerase
Sequencing System (CHIMERx) with 2 µg of pUC19 dsDNA and 2.5 units
of the enzymes.

To promote efficient priming, the pUC19 double-stranded DNA
template was denatured by adding deionized H2O to 18 µl, adding 2 µl of 2 M
NaOH, and incubating for 5 min. at room temperature. The reaction was
neutralized by adding 2 µl of 2 M ammonium acetate, pH 4.6, ethanol
precipitated, air-dried, and resuspended in 10 µl deionized water.

For each enzyme, a 22.75 µl extension/labeling cocktail was
prepared with the 2 µg denatured pUC19 dsDNA, 5.0 µl 5X Sequal buffer,
1.0 µl primer (0.5 pmol/µl), 1.0 µl alpha labeling mix (~45 µM each of
dCTP, dGTP, dTTP), 0.25 µl [α-35S]dATP (1000 Ci/mmol), 2.5 units
enzyme, balance H2O. This cocktail was incubated at 65° for 10 min.

Extension/termination reactions were performed by adding 5 µl
of extension/labeling cocktail to tubes containing 1 µl of the appropriate
d/ddNTP mix, and mixing gently. The reaction tubes were immediately



WO 96/14405 



PCT/US95/15327




placed at 65°C for 4 min., stopped by addition of 4 µl stop solution, and set
on ice.

Each reaction was heated at 90° for 5 min. immediately before
loading 1-2 µl onto a sequencing gel. Results are depicted in FIGURE 10C.
Native Tfl holoenzyme (not shown) was compared to recombinant holoenzyme
and to the Tfl exo- fragment. The bands were comparable for the
holoenzymes. The quality of the sequence data is comparable although the
signal was weaker when the Tfl exo- fragment was used.

D. 

The utility of recombinant Tfl holoenzyme and the exo-
fragment for double-stranded sequencing using a labeled sequencing primer
was demonstrated by substitution of these enzymes into the SEQUAL™ System
which uses 2 µg pUC19 dsDNA and [32P]-labeled primer FSP. The
reactions were performed according to CHIMERx's protocol. More
particularly, the double-stranded template was first denatured as described
above, and then sequencing reactions were performed essentially as described
in part A (substituting the pUC19 denatured dsDNA for ssM13mp18
template). As can be seen in FIGURE 10D, both the Tfl holoenzyme and the
Tfl exo- fragment produced good sequence data.



EXAMPLE 12

Polymerase Chain Reaction

The utility of recombinant Tfl holoenzyme and the exo-
fragment in PCR was demonstrated as follows. In a 0.5 ml reaction tube 85
µl water, 2 µl 10 mM dNTPs, 10 µl 10x Tfl Polymerase Reaction Buffer (10x
buffer is 500 mM Tris-HCl, pH 9.0, 200 mM (NH4)2SO4, 15 mM MgCl2),
1 µl each of 50 µM primers FTFL11 and RTFL12 (primer set 11-12), 50 µl
mineral oil and 1 µl of 15 µg/ml T. flavus genomic DNA were combined.
After the initial denaturation step (Step 1), 5.5 and 11 units of Tfl exo-
fragment, or 5 units of Tfl holoenzyme were added. As a control Taq pol I
in 1x Taq Polymerase Reaction Buffer (Example 10) was used to amplify the
genomic DNA. Amplifications were performed in an MJ Research PTC-100
Cycler with external sensor control. The amplification program was: Step 1:
95°C for 5 min.; Step 2: hold at 72°C; Step 3: 55°C for 45 sec.; Step 4:
72°C for 5 min.; Step 5: 95°C for 15 sec.; Step 6: repeat steps 3-5 thirty-four
times; Step 7: 55°C for 45 sec.; Step 8: 72°C for 20 min.; Step 9: hold at
4°C.
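
The programmed duration of this amplification can be tallied from the step list. The sketch below counts only the timed steps; the two hold steps (Step 2 and Step 9) have no fixed duration and the ramp times between temperatures depend on the instrument, so both are left out. The constant names are illustrative.

```python
# Timed steps of the amplification program, in seconds.
INITIAL_DENATURATION_S = 5 * 60                      # Step 1: 95°C for 5 min
CYCLE_STEPS_S = {
    "anneal 55C": 45,                                # Step 3
    "extend 72C": 5 * 60,                            # Step 4
    "denature 95C": 15,                              # Step 5
}
N_CYCLES = 1 + 34                                    # first pass plus Step 6 repeats
FINAL_STEPS_S = {"anneal 55C": 45, "extend 72C": 20 * 60}   # Steps 7 and 8


def programmed_seconds():
    """Total programmed time, excluding holds and temperature ramps."""
    per_cycle = sum(CYCLE_STEPS_S.values())
    return (INITIAL_DENATURATION_S
            + N_CYCLES * per_cycle
            + sum(FINAL_STEPS_S.values()))


total_s = programmed_seconds()
print(f"{total_s} s = {total_s / 60:.2f} min")   # 14145 s = 235.75 min
```

Steps 3-5 run once and are then repeated thirty-four times, giving 35 cycles in all before the final annealing and extension.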

The amplification products were separated on 1.2% agarose
gels. Primer set 11-12 gave a single amplification product from T. flavus
genomic DNA. Five units of the Tfl exo- fragment produced a single product:
the yield was slightly less than that obtained with Taq polymerase and better
than the yield from Tfl holoenzyme.



EXAMPLE 13
Thermal Cycle Labeling With Tfl DNA Pol I

The protocol described in Example 4 was used to demonstrate
the utility of recombinant Tfl DNA pol I holoenzyme and the Tfl exo-
fragment for thermal cycle labeling (TCL). See co-owned, co-pending U.S.
Patent Application Serial No. 08/217,459, filed March 24, 1994, entitled
"Materials and Methods for Restriction Endonuclease Applications," and PCT
Application No. US94/03246, filed March 24, 1994.

Thermal Cycle Labeling (TCL) is a procedure for labeling
double-stranded DNA while simultaneously amplifying large amounts of the
labeled probe. TCL of DNA requires two general steps: 1) generation of the
sequence-specific oligonucleotides by CviJI* (Molecular Biology Resources,
Milwaukee, WI) restriction of the template DNA; and 2) repeated cycles of
denaturation, annealing, and extension in the presence of a thermostable DNA
polymerase or a functional fragment thereof which maintains polymerase
activity. Optimal results are obtained after 20 such cycles, which are best
performed in an automated thermal cycling instrument such as a Perkin-Elmer
Model 480 thermocycler. With such an instrument, about 1.5
hr. is required to complete this protocol. If a thermal cycler is not available,
these reactions may be performed using heat blocks. As few as 5 cycles may
yield probes with acceptable detection sensitivities. The generation of
sequence-specific oligonucleotides for use in this method may also be
accomplished using the restriction endonuclease reagent CGase I (Molecular
Biology Resources) or the restriction endonuclease Aci I, which has as a
recognition sequence CCGC.

Non-radioactive labeling of DNA using TCL is accomplished
by mixing: 10 pg - 100 ng linearized template, 50 ng CviJI*-digested primers,
1.5 µl 10X labeling buffer, 2.5 - 5 units thermostable DNA polymerase, 1 µl
of 1 mM Biotin-11-dUTP (Enzo Diagnostics, New York, New York), 1.5 µl
each of dATP, dCTP, and dGTP (2 mM), and 1.0 µl 2 mM dTTP. The
reaction mixture is brought to a volume of 15 µl with deionized H2O, overlaid
with mineral oil and cycled through 20 rounds of denaturation, annealing and
extension. A typical cycling regimen employs 20 cycles of denaturation at
91°C for 5 sec., annealing at 50°C for 5 sec. and extension at 72°C for 30 sec.
The reaction is then terminated by adding 1 µl of 0.5 M EDTA, pH 8.0. The
amplified, labeled probe is a very heterogeneous mixture of fragments, which
appears as a smear when analyzed by agarose gel electrophoresis.
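
The nucleotide concentrations in the assembled TCL reaction follow from the stock concentrations and volumes listed above. The sketch below is a simple dilution calculation that assumes the 15 µl final volume and the 2 mM and 1 mM stocks as read from the text; the names used are illustrative.

```python
def final_micromolar(stock_mM, volume_ul, total_ul):
    """Final concentration in µM after adding `volume_ul` of a mM stock."""
    return stock_mM * 1000.0 * volume_ul / total_ul


TOTAL_UL = 15.0

# (stock concentration in mM, volume added in µl)
additions = {
    "dATP": (2.0, 1.5),
    "dCTP": (2.0, 1.5),
    "dGTP": (2.0, 1.5),
    "dTTP": (2.0, 1.0),
    "biotin-11-dUTP": (1.0, 1.0),
}
final_uM = {
    name: final_micromolar(stock, vol, TOTAL_UL)
    for name, (stock, vol) in additions.items()
}

# Share of the dTTP + biotin-11-dUTP pool that carries the biotin label.
labeled_fraction = final_uM["biotin-11-dUTP"] / (
    final_uM["biotin-11-dUTP"] + final_uM["dTTP"]
)

# Hold time of the typical regimen: 20 cycles of 5 s + 5 s + 30 s.
cycling_seconds = 20 * (5 + 5 + 30)

for name, conc in final_uM.items():
    print(f"{name}: {conc:.1f} µM")
```

Under these assumptions the reduced dTTP volume brings the combined dTTP and biotin-11-dUTP concentration to the same level as each of the other three nucleotides.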

The performance of recombinant Tfl DNA pol I holoenzyme,
Tfl exo- fragment, Taq holoenzyme, and the Stoffel fragment (control) was
assayed by substitution of these enzymes for the enzyme provided with the
CHIMERx TCL kit (ZEPTO™ Labeling Kit). Five units of each enzyme and
biotin-11-dUTP as the label were used. The substrate was pUC19 DNA.

After cycling of the samples the relative efficiency of the
labeling reaction was determined by electrophoresis on a 0.7% agarose gel.
The ethidium bromide gel staining of amplified DNA shows the characteristic
smear for all enzymes used. The efficiency of incorporation was then
determined by dot blot analysis.

The hybridized and developed filter showed that the
holoenzymes (native and recombinant Tfl, and Taq) can be diluted 1:10^6 and
still generate a visible dot. The samples which were labeled with Tfl exo- or
the Stoffel fragment can clearly be seen after a 1:10* dilution. The 1:10^5
dilution gave a weak signal when the exo- fragment was used for the TCL
reaction.

Another aspect of the present invention involves a variation of
TCL called Universal Thermal Cycle Labeling (UTCL) in which the extension
primers are not supplied by CviJI* restriction. Without intending to be
limited to a particular theory, one explanation for the mechanism of UTCL is
that the Tfl DNA pol I holoenzyme itself may supply endogenous "random"
primers for enzymatic extension in a TCL-type reaction. Alternatively, some
other explanation accounts for the mechanism of UTCL.

In a UTCL reaction, recombinant Tfl DNA pol I holoenzyme
is combined with intact DNA template and is subjected to repeated cycles of
denaturation, annealing, and extension. A radioactive- or non-isotopically-
labeled deoxynucleotide triphosphate is incorporated during the extension step
for subsequent detection purposes. The amplified, labeled probe represents
a very heterogeneous mixture of fragments, which appears as a large molecular
weight smear when analyzed by agarose gel electrophoresis. The utility of
recombinant Tfl DNA pol I for Universal Thermal Cycle Labeling is
demonstrated by substituting this enzyme in the UTCL protocol described in
co-owned, copending U.S. Patent App. Ser. No. 08/217,459, filed March 24,
1994 (Example 12), incorporated herein by reference.

EXAMPLE 14

Reverse Transcription with
T. flavus DNA Pol I Holoenzyme and exo- Fragment

RNA-dependent DNA polymerase activity of the Tfl DNA
polymerases was analyzed using the following procedure: In a 0.5 ml reaction
tube, 2.5 µl 1 M Tris-HCl, pH 8.3, 5 µl of 0.6 M KCl, 5 µl of 0.04 M
MgCl2, 17.5 µl of water, 10 µl of 2 mM poly rA:dT (the substrate) and 5 µl
of 5 mM [α-32P]TTP at 10 µCi/ml were combined. After incubation at 55°C for
5 min., the reaction was started by the addition of 5 µl of enzyme (five
DNA-dependent DNA polymerase units per µl). The reaction was allowed to
proceed at 55°C for 30 min., and terminated by taking a 40 µl aliquot and
adding it to 50 µl of 10% tRNA, 2% sodium pyrophosphate. The samples
were precipitated with TCA and the enzyme activity was determined as
described above. The RNA-dependent polymerase activity of the native and
the recombinant Tfl DNA pol I was determined to be about 6% of the
DNA-dependent polymerase activity. When 10 (RNA-dependent DNA
polymerase) units of AMV-RT (Molecular Biology Resources, Inc., Cat. No.
1372-01) were compared to 10 (DNA-dependent DNA polymerase) units of
Tfl DNA pol I, it was found that nTfl DNA pol I possesses 2.4% and rTfl DNA
pol I 1.6% of the RNA-dependent DNA polymerase activity of AMV-RT.
Titration of the MgCl2 and the MnCl2 concentration revealed that the native
and the recombinant holoenzymes prefer MgCl2 over MnCl2 for RT activity.

The Tfl exo- fragment has a lower RT activity than the
holoenzyme, but has a broader temperature range for activity. First strand
cDNA synthesis with the holoenzymes apparently yields a product of the same
length as that obtained by using AMV-RT. The recombinant T. flavus DNA
polymerase I and the exo- fragment both exhibit reverse transcriptase function
which can be used in applications such as RT-PCR or cDNA preparation at
elevated temperatures.

EXAMPLE 15

Comparison of the Processivity
of DNA Polymerase Enzymes

Using a modification of a procedure described by Tabor et al.,
J. Biol. Chem. 262: 16212-16223 (1987), the processivity of native and
recombinant Tfl DNA pol I holoenzyme, Tfl exo- fragment, and Taq DNA pol
I holoenzyme were compared. The "processivity" of a DNA polymerase
enzyme is a measure of the rate at which the enzyme moves forward along a
template while catalyzing DNA synthesis, i.e., a measure of the speed at
which DNA polymerization takes place in the presence of the enzyme.

To prepare the assay, a 60 µl reaction cocktail was prepared
with 3 µg M13mp18 ssDNA, 12 µl ddATP mix (20 µM dATP; 60 µM each
of dCTP, dGTP, and dTTP; 300 µM ddATP), 3.0 µl α-33P labeled forward
sequencing primer (3 ng/µl), 12 µl 5x reaction buffer (250 mM Tris-HCl, pH
9.5; 12.5 mM MgCl2), balance H2O. Additionally, dilutions of native and
recombinant Tfl holoenzyme, Tfl exo- fragment, and Taq holoenzyme were
prepared with appropriate storage buffer to create enzyme solutions of 0.125
and 0.0125 units/µl for the holoenzymes and 0.25 and 0.025 units/µl for Tfl
exo- fragment.

To perform the assay, 7.0 µl of the reaction cocktail were
mixed with 2.0 µl of diluted DNA polymerase enzyme. By using 0.25 and
0.025 units of Taq, nTfl, and rTfl holoenzyme and 0.5 and 0.05 units of exo-
fragment per reaction, reactions containing approximately 1:100 and 1:1000
enzyme molecule:template molecule are obtained. The use of such low
polymerase concentrations minimizes the "bumping off" from template by
competing polymerase molecules. Reaction mixtures were incubated
at 65°C and 3 µl samples were removed at 1.0, 2.5 and 6.0 minute time
points. Reactions were stopped by adding 1.0 µl stop buffer
(EDTA/DTT/bromophenol blue/xylene cyanol), were heated at 90°C for 3
min., and were loaded onto 7.5% polyacrylamide sequencing gels. The gels
were electrophoresed until the bromophenol blue dye was about 3/4 down the
gel, and an autoradiograph of the gel was taken overnight at -70°C.
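
The enzyme amounts quoted per reaction can be checked against the dilutions prepared for the assay. The following sketch multiplies each diluted stock (taken here as units per microliter) by the 2.0 µl added per reaction, and also computes the template mass carried in by 7.0 µl of the 60 µl cocktail; the dictionary and variable names are illustrative.

```python
ENZYME_VOLUME_UL = 2.0      # diluted enzyme added per reaction
COCKTAIL_TOTAL_UL = 60.0    # volume of the reaction cocktail
COCKTAIL_USED_UL = 7.0      # cocktail volume per reaction
TEMPLATE_UG = 3.0           # M13mp18 ssDNA in the whole cocktail

# Diluted enzyme stocks in units per microliter.
dilutions_u_per_ul = {
    "holoenzymes": (0.125, 0.0125),
    "Tfl exo- fragment": (0.25, 0.025),
}

units_per_reaction = {
    name: tuple(conc * ENZYME_VOLUME_UL for conc in concs)
    for name, concs in dilutions_u_per_ul.items()
}

template_ug_per_reaction = TEMPLATE_UG * COCKTAIL_USED_UL / COCKTAIL_TOTAL_UL

print(units_per_reaction)
print(f"{template_ug_per_reaction:.2f} µg template per reaction")
```

The computed unit amounts match the 0.25/0.025 and 0.5/0.05 units per reaction stated in the text.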

With this assay, a highly processive enzyme produces strong,
slow-mobility (larger) labeled bands on an autoradiograph, whereas a less
processive DNA polymerase produces higher-mobility (smaller) fragments
and/or bands with less intensity. Autoradiographs of the 6 min.
incubation/1:100 enzyme:template reactions revealed the exo- fragment
produced bands with the most intensity, followed by the rTfl and nTfl
holoenzyme, then the Taq holoenzyme. The length of fragments obtained by
the four enzymes was very comparable. Autoradiographs from the 1:1000
enzyme:template reaction indicate that processivity (from best to least) is Tfl
exo- fragment > Taq holoenzyme > nTfl holoenzyme and rTfl holoenzyme.
These results indicate that Tfl exo- fragment has greater processivity than
either Tfl holoenzyme (native or recombinant) or Taq holoenzyme.

EXAMPLE 16

Large-Scale Purification of Recombinant Tfl
DNA Polymerase I Holoenzyme and Exo- Fragment

Both the recombinant Tfl holoenzyme and Tfl exo- fragment
were purified on a large "production" scale by modifying the procedure
described above for purifying native Tfl holoenzyme.

Four hundred sixty grams of induced E. coli DH5αF'IQ cells
transformed with pTFLRT4 (cultured and frozen as described above) were
thawed and suspended in 2500 ml of lysis buffer A (20 mM Tris-HCl, pH 8;
0.5 mM EDTA; 7 mM β-mercaptoethanol; 10 mM MgCl2). For Tfl exo-
fragment, 787 grams of E. coli transformed with p21EHcMl.1 (cultured and
frozen as described above) were used. Phenylmethylsulfonyl fluoride (PMSF)
was added to a final concentration of 0.3 mM.

The suspension was then treated with 0.2 g/l of lysozyme
(predissolved in lysis buffer) at 4°C for 1 hr. Cells were homogenized twice
at 9000 psi in a Manton Gaulin homogenizer, with the suspension chilled to
approximately 10°C between passes. New PMSF was added to 0.2 g/l
before, between and after passes. The suspension of lysed cells was divided
into 300 ml portions, heated to 65°C for 1 hr., cooled down to 4°C, and
centrifuged for 30 min. at 13,500 x g.

Following the centrifugation, NaCl and polyethyleneimine (PEI)
(10% w/v, pH 7.0) were added to the heat-treated supernatant to a final
concentration of 0.5 M and to 0.1%, respectively. The sample was mixed
well and centrifuged at 13,500 x g for 1 hour.

The supernatant from the twice-centrifuged, heat-treated lysate
was desalted by diluting with 10 liters of DE52 column buffer (20 mM
Tris-HCl, pH 8.0; 0.5 mM EDTA; 7 mM βME) and concentrated to
approximately 4 liters using an Amicon S10Y30 Spiral Ultrafiltration
cartridge. The dilution/concentration step was repeated two times, with a
final concentrated volume of about 4 liters.

The desalted sample was batch contacted with 400 g of
equilibrated Whatman DE52 ion exchange resin (Maidstone, England). The
suspension was collected on a sintered glass funnel and washed 3 times with
1 volume of DE52 column buffer. The resin was then resuspended in a
minimal volume of buffer and poured into a column (4.5 x 50 cm), packed
and washed with an additional volume of buffer. The column was eluted with
a 0-0.5 M NaCl linear gradient (total gradient volume: 2000 ml).
Twenty-five ml fractions were collected at a rate of about 5 ml/min. Peak fractions
(fractions containing DNA polymerase activity) were determined by a
modified DNA polymerase assay described by Kaledin et al., Biokhimiya
45:644-651 (1980), pooled and dialyzed in approximately twenty-five volumes
of Affi-Gel Blue (AGB) column buffer (20 mM Tris-HCl, pH 7.5; 0.5 mM
EDTA; 10 mM βME; 10 mM MgCl2; 0.02% Brij 35).

The dialyzed DE52 peak fractions were applied to an AGB
column (4.4 x 40 cm, 600 ml packed volume, MBR Blue, Molecular Biology
Resources, Milwaukee WI), which was washed with 2 column volumes of
AGB column buffer, and eluted with a 0-1.2 M NaCl linear gradient (total
gradient volume: 2000 ml). To elute the exo- fragment, a 0-1.5 M NaCl
linear gradient was employed. Twenty-five ml fractions were collected at a
rate of 1-5 ml/min. The peak fractions were dialyzed as above in AGB
buffer.

The dialyzed AGB peak fractions were applied to a Heparin
Agarose column (4.4 x 16.5 cm, 250 ml packed volume (Bio-Rad Affigel
Heparin or Heparin Agarose from Molecular Chimerics, Madison, WI)),
which was washed with approximately 2 column volumes (until effluent is no
longer colored, and column resin is white in appearance), and eluted with a
0.1-1.0 M NaCl linear gradient (total gradient volume: 1500 ml). To elute
the exo- fragment, a 0.15-1.0 M NaCl linear gradient was employed.
Twenty-five ml fractions were collected at a rate of 1-5 ml/min. The peak fractions
were dialyzed in HP Q Sepharose Column Buffer (20 mM Tris-HCl, pH 7.5;
0.5 mM EDTA; 7 mM βME; 0.1% Brij 35).

The dialyzed heparin agarose peak fractions were filtered
through a 0.2 µm filter and applied at 4 ml/min. to the HP Q Sepharose
column (Pharmacia, Uppsala, Sweden) on FPLC. The column was washed
with several column volumes of buffer, and eluted with a 0-0.25 M NaCl
linear gradient. Ten ml fractions were collected at 4 ml/minute. The peak
fractions were dialyzed in HP S Column Buffer (20 mM Na-Citrate, pH 6.0;
1 mM EDTA; 7 mM βME; 0.1% Brij 35) or diluted in the same buffer,
depending on the volume of the fraction pool.

The dialyzed (or diluted) HP Q peak fractions were filtered
through a 0.2 µm filter and the HP S column (Pharmacia) was run as above,
washing with HP S Column buffer and eluting with a 0-0.25 M NaCl
gradient. Peak fractions were pooled and dialyzed against 4 liters of Final
Storage Buffer (50 mM Tris-HCl, pH 7.5, 0.1 mM EDTA, 5 mM DTT, 50%
glycerol). The final product was diluted to a concentration of 5000 U/ml in
the above buffer including 0.5% Tween 20 (Sigma Chemical Co., St. Louis,
MO) and 0.5% Nonidet P40 (Fluka Biochemika, Buchs, Switzerland) as
stabilizers and stored at -20°C. To purify the recombinant holoenzyme, the
HP S column purification was unnecessary, and therefore was omitted.

Samples from the above-described preparations were
electrophoresed using SDS-PAGE and visualized with silver staining. The rTfl
holoenzyme and exo- fragment appeared as single bands having apparent
molecular weights of 88,000 and 63,000 Da, respectively, each being greater
than 95% pure. A quantitative analysis of the enzymes prepared using the
above-described purification procedure is as shown in Table 6:



TABLE 6

Enzyme                   Quantity of Cells   Specific Activity     Yield
                                             (Units/mg protein)    (Units/g cells)
nTfl Holo (Example 1)    1200 g              50,000 U/mg           1,700 U/g
rTfl Holo                460 g               70,000 U/mg           4,300 U/g
rTfl exo-                787 g               192,000 U/mg          5,600 U/g
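
The figures in Table 6 imply a total yield of enzyme units and an approximate mass of purified protein for each preparation. The sketch below derives both, assuming the yield for the exo- fragment is 5,600 U/g; these derived totals are not stated in the text and are shown only as worked arithmetic.

```python
# Values transcribed from Table 6.
preparations = {
    "nTfl Holo": {"cells_g": 1200, "specific_u_per_mg": 50_000, "yield_u_per_g": 1_700},
    "rTfl Holo": {"cells_g": 460, "specific_u_per_mg": 70_000, "yield_u_per_g": 4_300},
    "rTfl exo-": {"cells_g": 787, "specific_u_per_mg": 192_000, "yield_u_per_g": 5_600},
}


def summarize(prep):
    """Return (total units recovered, mg of purified protein implied)."""
    total_units = prep["cells_g"] * prep["yield_u_per_g"]
    protein_mg = total_units / prep["specific_u_per_mg"]
    return total_units, protein_mg


for name, prep in preparations.items():
    units, mg = summarize(prep)
    print(f"{name}: {units:,} units, about {mg:.1f} mg protein")
```

On a per-gram basis the recombinant holoenzyme preparation yields roughly 2.5 times the units of the native preparation (4,300 versus 1,700 U/g).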



The biological activities of the recombinant enzymes purified
by the above-described protocol were analyzed using the assays described in
preceding Examples. In the endonuclease activity assay described in Example
9, five, ten, and twenty unit challenges resulted in less than 5% conversion
of supercoiled pBR322 to the linear form. The results of other assays
described in Example 9 are summarized in Table 7:



TABLE 7

ASSAY             rTfl holo           Tfl exo-
ds DNase          0% slope/unit       0% slope/unit
ss DNase          0% slope/unit       0% slope/unit
3' Exonuclease    0% slope/unit       0.06% slope/unit
5' Exonuclease    0.48% slope/unit    0% slope/unit

Deposit of Biological Materials: The following plasmids have
been deposited with the American Type Culture Collection (ATCC), 12301
Parklawn Dr., Rockville MD 20852 (USA) pursuant to the provisions of the
Budapest Treaty:



Designation    Deposit Date    ATCC No.    Host Strain
pTFLRT4        May 26, 1994    69633       DH5αF'IQ
p21EHcMl.1     May 26, 1994    69632       DH5αF'

Availability of the deposited materials is not to be construed as a license to
practice the invention in contravention of the rights granted under the
authority of any government in accordance with its patent laws.

The present invention has been described with reference to
specific examples and embodiments. However, this application is intended to
cover those changes and substitutions which are apparent and may be made by
those skilled in the art without departing from the spirit and scope of the
claims.

SEQUENCE LISTING



(1) GENERAL INFORMATION: 

(i) APPLICANT: 

(A) NAME: Molecular Biology Resources, Inc. 

(B) STREET: 5520 W. Burleigh Street

(C) CITY: Milwaukee

(D) STATE: Wisconsin 

(E) COUNTRY: United States of America 

(F) POSTAL CODE: 53210 

(ii) TITLE OF INVENTION: Biologically Active Fragments of
Thermus Flavus DNA Polymerase

(iii) NUMBER OF SEQUENCES: 51 

(iv) CORRESPONDENCE ADDRESS: 

(A) ADDRESSEE : Marshall, O'Toole, Gerstein, Murray & Borun 

(B) STREET: 6300 Sears Tower, 233 South Wacker Drive 

(C) CITY: Chicago 

(D) STATE: Illinois 

(E) COUNTRY: United States of America 

(F) POSTAL CODE: 60606-6402 

(v) COMPUTER READABLE FORM: 

(A) MEDIUM TYPE: Floppy disk 

(B) COMPUTER: IBM PC compatible

(C) OPERATING SYSTEM: PC-DOS/MS-DOS 

(D) SOFTWARE: PatentIn Release #1.0, Version #1.25

(vi) CURRENT APPLICATION DATA: 
(A) APPLICATION NUMBER: 
(B) FILING DATE:
(C) CLASSIFICATION: 

(viii) ATTORNEY/AGENT INFORMATION:
(A) NAME: Gass, David A.
(B) REGISTRATION NUMBER: 38,153
(C) REFERENCE/DOCKET NUMBER: 28003/31716

(ix) TELECOMMUNICATION INFORMATION: 

(A) TELEPHONE: 312/474-6300 

(B) TELEFAX: 312/474-0448 

(C) TELEX: 25-3856 



(2) INFORMATION FOR SEQ ID NO : 1 : 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 3048 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: DNA (genomic) 



(ix) FEATURE: 

(A) NAME/KEY: CDS 

(B) LOCATION: 301..2805 
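
The CDS location can be related to the length of the encoded polypeptide by simple arithmetic. The following is a minimal sketch and is not part of the original listing; the function name `cds_codon_count` is hypothetical. It assumes the location is 1-based and inclusive, and that the last codon in the range is a stop codon, which is consistent with the TAG that follows the final Gly codon in the listed sequences.

```python
def cds_codon_count(start: int, end: int) -> int:
    """Return the number of codons in a 1-based, inclusive CDS range."""
    length = end - start + 1
    if length % 3 != 0:
        raise ValueError("CDS length is not a multiple of three")
    return length // 3


# SEQ ID NO: 1, CDS location 301..2805
codons_seq1 = cds_codon_count(301, 2805)      # 2505 nt -> 835 codons
residues_seq1 = codons_seq1 - 1               # minus the stop codon -> 834

# SEQ ID NO: 3, CDS location 1..1794
residues_seq3 = cds_codon_count(1, 1794) - 1  # 598 codons -> 597
```

Under these assumptions the counts agree with the lengths stated for SEQ ID NO: 2 (834 amino acids) and SEQ ID NO: 4 (597 amino acids).
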
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 1: 

TACTTCGGCG GGGTGAAGCT CGGGGCCGGG GGGCTTGTGC GGGCCTACGG GGGGGTGGCG 60 

GCGGAGGCCT TAAGCGGGCG CCCAAGGTCC CCTTGGTGGA GCGGGTGGGG CTCGCCTTCC 120 

TCGTGCCCTT CGCCGAGGTG GGCCGGGTCT ACGCCCTCCT GGAGGCCCGC GCCCTGAAGG 180 

CCGAGGAGAC CTACACCCCG GAGGGCGTGC GCTTCGCCCT CCTCCTCCCC AAGCCCGAGC 240 

GGGAAGGTTT CCTCAGGGCG CTCCTGGACG CCACCCGGGG ACAGGTGGCC CTGGAGTAGC 300 

ATG GAG GCG ATC GTT CCG CTC TTT GAA CCC AAA GGC CGG GTC CTC CTG 348 
Met Glu Ala Ile Val Pro Leu Phe Glu Pro Lys Gly Arg Val Leu Leu 
1 5 10 15 

GTG GAC GGC CAC CAC CTG GCC TAC CGC ACC TTC TTC GCC CTG AAG GGC 396 
Val Asp Gly His His Leu Ala Tyr Arg Thr Phe Phe Ala Leu Lys Gly 
20 25 30 

CTC ACC ACG AGC CGG GGC GAA CCG GTG CAG GCG GTC TAC GGC TTC GCC 444 
Leu Thr Thr Ser Arg Gly Glu Pro Val Gln Ala Val Tyr Gly Phe Ala 
35 40 45 

AAG AGC CTC CTC AAG GCC CTG AAG GAG GAC GGG TAC AAG GCC GTC TTC 492 
Lys Ser Leu Leu Lys Ala Leu Lys Glu Asp Gly Tyr Lys Ala Val Phe 
50 55 60 

GTG GTC TTT GAC GCC AAG GCC CCC TCC TTC CGC CAC GAG GCC TAC GAG 540 
Val Val Phe Asp Ala Lys Ala Pro Ser Phe Arg His Glu Ala Tyr Glu 
65 70 75 80 

GCC TAC AAG GCG GGG AGG GCC CCG ACC CCC GAG GAC TTC CCC CGG CAG 588 
Ala Tyr Lys Ala Gly Arg Ala Pro Thr Pro Glu Asp Phe Pro Arg Gln 
85 90 95 

CTC GCC CTC ATC AAG GAG CTG GTG GAC CTC CTG GGG TTT ACC CGC CTC 636 
Leu Ala Leu Ile Lys Glu Leu Val Asp Leu Leu Gly Phe Thr Arg Leu 
100 105 110 

GAG GTC CCC GGC TAC GAG GCG GAC GAC GTC CTC GCC ACC CTG GCC AAG 684 
Glu Val Pro Gly Tyr Glu Ala Asp Asp Val Leu Ala Thr Leu Ala Lys 
115 120 125 

AAG GCG GAA AAG GAG GGG TAC GAG GTG CGC ATC CTC ACC GCC GAC CGC 732 
Lys Ala Glu Lys Glu Gly Tyr Glu Val Arg Ile Leu Thr Ala Asp Arg 
130 135 140 

GAC CTC TAC CAA CTC GTC TCC GAC CGC GTC GTC GTC CTC CAC CCC GAG 780 
Asp Leu Tyr Gln Leu Val Ser Asp Arg Val Val Val Leu His Pro Glu 
145 150 155 160 

GGC CAC CTC ATC ACC CCG GAG TGG CTT TGG GAG AAG TAC GGC CTC AAG 828 
Gly His Leu Ile Thr Pro Glu Trp Leu Trp Glu Lys Tyr Gly Leu Lys 
165 170 175 

CCG GAG CAG TGG GTG GAC TTC CGC GCC CTC GTG GGG GAC CCC TCC GAC 876 
Pro Glu Gln Trp Val Asp Phe Arg Ala Leu Val Gly Asp Pro Ser Asp 
180 185 190 

AAC CTC CCC GGG GTC AAG GGC ATC GGG GAG AAG ACC GCC CTC AAG CTC 924 
Asn Leu Pro Gly Val Lys Gly Ile Gly Glu Lys Thr Ala Leu Lys Leu 
195 200 205 

CTC AAG GAG TGG GGA AGC CTG GAA AAC CTC CTC AAG AAC CTG GAC CGG 972 
Leu Lys Glu Trp Gly Ser Leu Glu Asn Leu Leu Lys Asn Leu Asp Arg 
210 215 220 

GTA AAG CCA GAA AAC GTC CGG GAG AAG ATC AAG GCC CAC CTG GAA GAC 1020 
Val Lys Pro Glu Asn Val Arg Glu Lys Ile Lys Ala His Leu Glu Asp 
225 230 235 240 

CTC AGG CTT TCC TTG GAG CTC TCC CGG GTG CGC ACC GAC CTC CCC CTG 1068 
Leu Arg Leu Ser Leu Glu Leu Ser Arg Val Arg Thr Asp Leu Pro Leu 
245 250 255 

GAG GTG GAC CTC GCC CAG GGG CGG GAG CCC GAC CGG GAG GGG CTT AGG 1116 
Glu Val Asp Leu Ala Gln Gly Arg Glu Pro Asp Arg Glu Gly Leu Arg 
260 265 270 

GCC TTC CTG GAG AGG CTG GAG TTC GGC AGC CTC CTC CAC GAG TTC GGC 1164 
Ala Phe Leu Glu Arg Leu Glu Phe Gly Ser Leu Leu His Glu Phe Gly 
275 280 285 

CTC CTG GAG GCC CCC GCC CCC CTG GAG GAG GCC CCC TGG CCC CCG CCG 1212 
Leu Leu Glu Ala Pro Ala Pro Leu Glu Glu Ala Pro Trp Pro Pro Pro 
290 295 300 

GAA GGG GCC TTC GTG GGC TTC GTC CTC TCC CGC CCC GAG CCC ATG TGG 1260 
Glu Gly Ala Phe Val Gly Phe Val Leu Ser Arg Pro Glu Pro Met Trp 
305 310 315 320 

GCG GAG CTT AAA GCC CTG GCC GCC TGC AGG GAC GGC CGG GTG CAC CGG 1308 
Ala Glu Leu Lys Ala Leu Ala Ala Cys Arg Asp Gly Arg Val His Arg 
325 330 335 

GCA GCA GAC CCC TTG GCG GGG CTA AAG GAC CTC AAG GAG GTC CGG GGT 1356 
Ala Ala Asp Pro Leu Ala Gly Leu Lys Asp Leu Lys Glu Val Arg Gly 
340 345 350 

CTC CTC GCC AAG GAC CTC GCC GTC TTG GCC TCG AGG GAG GGG CTA GAC 1404 
Leu Leu Ala Lys Asp Leu Ala Val Leu Ala Ser Arg Glu Gly Leu Asp 
355 360 365 

CTC GTG CCC GGG GAC GAC CCC ATG CTC CTC GCC TAC CTC CTG GAC CCC 1452 
Leu Val Pro Gly Asp Asp Pro Met Leu Leu Ala Tyr Leu Leu Asp Pro 
370 375 380 

TCC AAC ACC ACC CCC GAG GGG GTG GCG CGG CGC TAC GGG GGG GAG TGG 1500 
Ser Asn Thr Thr Pro Glu Gly Val Ala Arg Arg Tyr Gly Gly Glu Trp 
385 390 395 400 

ACG GAG GAC GCC GCC CAC CGG GCC CTC CTC TCG GAG AGG CTC CAT CGG 1548 
Thr Glu Asp Ala Ala His Arg Ala Leu Leu Ser Glu Arg Leu His Arg 
405 410 415 

AAC CTC CTT AAG CGC CTC GAG GGG GAG GAG AAG CTC CTT TGG CTC TAC 1596 
Asn Leu Leu Lys Arg Leu Glu Gly Glu Glu Lys Leu Leu Trp Leu Tyr 
420 425 430 

CAC GAG GTG GAA AAG CCC CTC TCC CGG GTC CTG GCC CAC ATG GAG GCC 1644 
His Glu Val Glu Lys Pro Leu Ser Arg Val Leu Ala His Met Glu Ala 
435 440 445 

ACC GGG GTA CGG CTG GAC GTG GCC TAC CTG CAG GCC CTT TCC CTG GAG 1692 
Thr Gly Val Arg Leu Asp Val Ala Tyr Leu Gln Ala Leu Ser Leu Glu 
450 455 460 

CTT GCG GAG GAG ATC CGC CGC CTC GAG GAG GAG GTC TTC CGC TTG GCG 1740 
Leu Ala Glu Glu Ile Arg Arg Leu Glu Glu Glu Val Phe Arg Leu Ala 
465 470 475 480 

GGC CAC CCC TTC AAC CTC AAC TCC CGG GAC CAG CTG GAA AGG GTG CTC 1788 
Gly His Pro Phe Asn Leu Asn Ser Arg Asp Gln Leu Glu Arg Val Leu 
485 490 495 

TTT GAC GAG CTT AGG CTT CCC GCC TTG GGG AAG ACG CAA AAG ACG GGC 1836 
Phe Asp Glu Leu Arg Leu Pro Ala Leu Gly Lys Thr Gln Lys Thr Gly 
500 505 510 

AAG CGC TCC ACC AGC GCC GCG GTG CTG GAG GCC CTA CGG GAG GCC CAC 1884 
Lys Arg Ser Thr Ser Ala Ala Val Leu Glu Ala Leu Arg Glu Ala His 
515 520 525 

CCC ATC GTG GAG AAG ATC CTC CAG CAC CGG GAG CTC ACC AAG CTC AAG 1932 
Pro Ile Val Glu Lys Ile Leu Gln His Arg Glu Leu Thr Lys Leu Lys 
530 535 540 

AAC ACC TAC GTG GAC CCC CTC CCA AGC CTC GTC CAC CCG AGG ACG GGC 1980 
Asn Thr Tyr Val Asp Pro Leu Pro Ser Leu Val His Pro Arg Thr Gly 
545 550 555 560 

CGC CTC CAC ACC CGC TTC AAC CAG ACG GCC ACG GCC ACG GGG AGG CTT 2028 
Arg Leu His Thr Arg Phe Asn Gln Thr Ala Thr Ala Thr Gly Arg Leu 
565 570 575 

AGT AGC TCC GAC CCC AAC CTG CAG AAC ATC CCC GTC CGC ACC CCC TTG 2076 
Ser Ser Ser Asp Pro Asn Leu Gln Asn Ile Pro Val Arg Thr Pro Leu 
580 585 590 

GGC CAG AGG ATC CGC CGG GCC TTC GTG GCC GAG GCG GGA TGG GCG TTG 2124 
Gly Gln Arg Ile Arg Arg Ala Phe Val Ala Glu Ala Gly Trp Ala Leu 
595 600 605 

GTG GCC CTG GAC TAT AGC CAG ATA GAG CTC CGC GTC CTC GCC CAC CTC 2172 
Val Ala Leu Asp Tyr Ser Gln Ile Glu Leu Arg Val Leu Ala His Leu 
610 615 620 

TCC GGG GAC GAG AAC CTG ATC AGG GTC TTC CAG GAG GGG AAG GAC ATC 2220 
Ser Gly Asp Glu Asn Leu Ile Arg Val Phe Gln Glu Gly Lys Asp Ile 
625 630 635 640 

CAC ACC CAG ACC GCA AGC TGG ATG TTC GGC GTC CCC CCG GAG GCC GTG 2268 
His Thr Gln Thr Ala Ser Trp Met Phe Gly Val Pro Pro Glu Ala Val 
645 650 655 

GAC CCC CTG ATG CGC CGG GCG GCC AAG ACG GTG AAC TTC GGC GTC CTC 2316 
Asp Pro Leu Met Arg Arg Ala Ala Lys Thr Val Asn Phe Gly Val Leu 
660 665 670 

TAC GGC ATG TCC GCC CAT AGG CTC TCC CAG GAG CTT GCC ATC CCC TAC 2364 
Tyr Gly Met Ser Ala His Arg Leu Ser Gln Glu Leu Ala Ile Pro Tyr 
675 680 685 

GAG GAG GCG GTG GCC TTT ATA GAG CGC TAC TTC CAA AGC TTC CCC AAG 2412 
Glu Glu Ala Val Ala Phe Ile Glu Arg Tyr Phe Gln Ser Phe Pro Lys 
690 695 700 

GTG CGG GCC TGG ATA GAA AAG ACC CTG GAG GAG GGG AGG AAG CGG GGC 2460 
Val Arg Ala Trp Ile Glu Lys Thr Leu Glu Glu Gly Arg Lys Arg Gly 
705 710 715 720 

TAC GTG GAA ACC CTC TTC GGA AGA AGG CGC TAC GTG CCC GAC CTC AAC 2508 
Tyr Val Glu Thr Leu Phe Gly Arg Arg Arg Tyr Val Pro Asp Leu Asn 
725 730 735 

GCC CGG GTG AAG AGC GTC AGG GAG GCC GCG GAG CGC ATG GCC TTC AAC 2556 
Ala Arg Val Lys Ser Val Arg Glu Ala Ala Glu Arg Met Ala Phe Asn 
740 745 750 

ATG CCC GTC CAG GGC ACC GCC GCC GAC CTC ATG AAG CTC GCC ATG GTG 2604 
Met Pro Val Gln Gly Thr Ala Ala Asp Leu Met Lys Leu Ala Met Val 
755 760 765 

AAG CTC TTC CCC CGC CTC CGG GAG ATG GGG GCC CGC ATG CTC CTC CAG 2652 
Lys Leu Phe Pro Arg Leu Arg Glu Met Gly Ala Arg Met Leu Leu Gln 
770 775 780 

GTC CAC GAC GAG CTC CTC CTG GAG GCC CCC CAA GCG CGG GCC GAG GAG 2700 
Val His Asp Glu Leu Leu Leu Glu Ala Pro Gln Ala Arg Ala Glu Glu 
785 790 795 800 

GTG GCG GCT TTG GCC AAG GAG GCC ATG GAG AAG GCC TAT CCC CTC GCC 2748 
Val Ala Ala Leu Ala Lys Glu Ala Met Glu Lys Ala Tyr Pro Leu Ala 
805 810 815 

GTG CCC CTG GAG GTG GAG GTG GGG ATG GGG GAG GAC TGG CTT TCC GCC 2796 
Val Pro Leu Glu Val Glu Val Gly Met Gly Glu Asp Trp Leu Ser Ala 
820 825 830 

AAG GGT TAGGGGGGCC CTGCCGTTTA GAGGAAGTTC AAGGGGTTGT CCCTCAGAAA 2852 
Lys Gly 

CGCCTCCAGG GGAACGCCCT CTGCGGCTAC CAGGAGGCCT TTAGCCCCAA AGGTGCGGGT 2912 

GAAGGCTTCC AGGCCCTGGG TTCTTTTAAA GGGGGCGCTT TTGACCTCGA GGGCCAGGAG 2972 

GCGCTTTCCC TTTTGAAGGA CAAAGTCACT TCCTGGTCCC TTTCCCGCCA GTAGTACACC 3032 

TCAAACCCCC CCTGGT 3048 
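
The listing above pairs each row of codons with the amino acids they encode. The following is a minimal sketch, not part of the original listing, for re-deriving an amino-acid row from its codon row. It assumes the standard genetic code; the names `CODON_TABLE` and `translate` are hypothetical, and the output uses one-letter codes rather than the three-letter codes printed in the listing.

```python
from itertools import product

# Standard genetic code, with codons enumerated in T, C, A, G order.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def translate(dna: str) -> str:
    """Translate a DNA string (whitespace ignored) until the first stop codon."""
    seq = "".join(dna.split()).upper()
    protein = []
    for i in range(0, len(seq) - len(seq) % 3, 3):
        aa = CODON_TABLE[seq[i:i + 3]]
        if aa == "*":  # stop codon ends the translation
            break
        protein.append(aa)
    return "".join(protein)


# First codon row of the CDS in SEQ ID NO: 1 (nucleotides 301 to 348).
first_row = "ATG GAG GCG ATC GTT CCG CTC TTT GAA CCC AAA GGC CGG GTC CTC CTG"
# Last two codons of the CDS followed by the stop codon.
last_row = "AAG GGT TAG"

first_peptide = translate(first_row)
last_peptide = translate(last_row)
```

Here `first_peptide` corresponds to Met Glu Ala Ile Val Pro Leu Phe Glu Pro Lys Gly Arg Val Leu Leu, and `last_peptide` to the closing Lys Gly.
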



(2) INFORMATION FOR SEQ ID NO: 2: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 834 amino acids 

(B) TYPE: amino acid 
(D) TOPOLOGY: linear 



(ii) MOLECULE TYPE: protein 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 2: 
Met Glu Ala Ile Val Pro Leu Phe Glu Pro Lys Gly Arg Val Leu Leu 
1 5 10 15 

Val Asp Gly His His Leu Ala Tyr Arg Thr Phe Phe Ala Leu Lys Gly 
20 25 30 

Leu Thr Thr Ser Arg Gly Glu Pro Val Gln Ala Val Tyr Gly Phe Ala 
35 40 45 

Lys Ser Leu Leu Lys Ala Leu Lys Glu Asp Gly Tyr Lys Ala Val Phe 
50 55 60 

Val Val Phe Asp Ala Lys Ala Pro Ser Phe Arg His Glu Ala Tyr Glu 
65 70 75 80 

Ala Tyr Lys Ala Gly Arg Ala Pro Thr Pro Glu Asp Phe Pro Arg Gln 
85 90 95 

Leu Ala Leu Ile Lys Glu Leu Val Asp Leu Leu Gly Phe Thr Arg Leu 
100 105 110 

Glu Val Pro Gly Tyr Glu Ala Asp Asp Val Leu Ala Thr Leu Ala Lys 
115 120 125 

Lys Ala Glu Lys Glu Gly Tyr Glu Val Arg Ile Leu Thr Ala Asp Arg 
130 135 140 

Asp Leu Tyr Gln Leu Val Ser Asp Arg Val Val Val Leu His Pro Glu 
145 150 155 160 

Gly His Leu Ile Thr Pro Glu Trp Leu Trp Glu Lys Tyr Gly Leu Lys 
165 170 175 

Pro Glu Gln Trp Val Asp Phe Arg Ala Leu Val Gly Asp Pro Ser Asp 
180 185 190 

Asn Leu Pro Gly Val Lys Gly Ile Gly Glu Lys Thr Ala Leu Lys Leu 
195 200 205 

Leu Lys Glu Trp Gly Ser Leu Glu Asn Leu Leu Lys Asn Leu Asp Arg 
210 215 220 

Val Lys Pro Glu Asn Val Arg Glu Lys Ile Lys Ala His Leu Glu Asp 
225 230 235 240 

Leu Arg Leu Ser Leu Glu Leu Ser Arg Val Arg Thr Asp Leu Pro Leu 
245 250 255 

Glu Val Asp Leu Ala Gln Gly Arg Glu Pro Asp Arg Glu Gly Leu Arg 
260 265 270 

Ala Phe Leu Glu Arg Leu Glu Phe Gly Ser Leu Leu His Glu Phe Gly 
275 280 285 

Leu Leu Glu Ala Pro Ala Pro Leu Glu Glu Ala Pro Trp Pro Pro Pro 
290 295 300 

Glu Gly Ala Phe Val Gly Phe Val Leu Ser Arg Pro Glu Pro Met Trp 
305 310 315 320 

Ala Glu Leu Lys Ala Leu Ala Ala Cys Arg Asp Gly Arg Val His Arg 
325 330 335 

Ala Ala Asp Pro Leu Ala Gly Leu Lys Asp Leu Lys Glu Val Arg Gly 
340 345 350 

Leu Leu Ala Lys Asp Leu Ala Val Leu Ala Ser Arg Glu Gly Leu Asp 
355 360 365 

Leu Val Pro Gly Asp Asp Pro Met Leu Leu Ala Tyr Leu Leu Asp Pro 
370 375 380 

Ser Asn Thr Thr Pro Glu Gly Val Ala Arg Arg Tyr Gly Gly Glu Trp 
385 390 395 400 

Thr Glu Asp Ala Ala His Arg Ala Leu Leu Ser Glu Arg Leu His Arg 
405 410 415 

Asn Leu Leu Lys Arg Leu Glu Gly Glu Glu Lys Leu Leu Trp Leu Tyr 
420 425 430 

His Glu Val Glu Lys Pro Leu Ser Arg Val Leu Ala His Met Glu Ala 
435 440 445 

Thr Gly Val Arg Leu Asp Val Ala Tyr Leu Gln Ala Leu Ser Leu Glu 
450 455 460 

Leu Ala Glu Glu Ile Arg Arg Leu Glu Glu Glu Val Phe Arg Leu Ala 
465 470 475 480 

Gly His Pro Phe Asn Leu Asn Ser Arg Asp Gln Leu Glu Arg Val Leu 
485 490 495 

Phe Asp Glu Leu Arg Leu Pro Ala Leu Gly Lys Thr Gln Lys Thr Gly 
500 505 510 

Lys Arg Ser Thr Ser Ala Ala Val Leu Glu Ala Leu Arg Glu Ala His 
515 520 525 

Pro Ile Val Glu Lys Ile Leu Gln His Arg Glu Leu Thr Lys Leu Lys 
530 535 540 

Asn Thr Tyr Val Asp Pro Leu Pro Ser Leu Val His Pro Arg Thr Gly 
545 550 555 560 

Arg Leu His Thr Arg Phe Asn Gln Thr Ala Thr Ala Thr Gly Arg Leu 
565 570 575 

Ser Ser Ser Asp Pro Asn Leu Gln Asn Ile Pro Val Arg Thr Pro Leu 
580 585 590 

Gly Gln Arg Ile Arg Arg Ala Phe Val Ala Glu Ala Gly Trp Ala Leu 
595 600 605 

Val Ala Leu Asp Tyr Ser Gln Ile Glu Leu Arg Val Leu Ala His Leu 
610 615 620 

Ser Gly Asp Glu Asn Leu Ile Arg Val Phe Gln Glu Gly Lys Asp Ile 
625 630 635 640 

His Thr Gln Thr Ala Ser Trp Met Phe Gly Val Pro Pro Glu Ala Val 
645 650 655 

Asp Pro Leu Met Arg Arg Ala Ala Lys Thr Val Asn Phe Gly Val Leu 
660 665 670 

Tyr Gly Met Ser Ala His Arg Leu Ser Gln Glu Leu Ala Ile Pro Tyr 
675 680 685 

Glu Glu Ala Val Ala Phe Ile Glu Arg Tyr Phe Gln Ser Phe Pro Lys 
690 695 700 

Val Arg Ala Trp Ile Glu Lys Thr Leu Glu Glu Gly Arg Lys Arg Gly 
705 710 715 720 

Tyr Val Glu Thr Leu Phe Gly Arg Arg Arg Tyr Val Pro Asp Leu Asn 
725 730 735 

Ala Arg Val Lys Ser Val Arg Glu Ala Ala Glu Arg Met Ala Phe Asn 
740 745 750 

Met Pro Val Gln Gly Thr Ala Ala Asp Leu Met Lys Leu Ala Met Val 
755 760 765 

Lys Leu Phe Pro Arg Leu Arg Glu Met Gly Ala Arg Met Leu Leu Gln 
770 775 780 

Val His Asp Glu Leu Leu Leu Glu Ala Pro Gln Ala Arg Ala Glu Glu 
785 790 795 800 

Val Ala Ala Leu Ala Lys Glu Ala Met Glu Lys Ala Tyr Pro Leu Ala 
805 810 815 

Val Pro Leu Glu Val Glu Val Gly Met Gly Glu Asp Trp Leu Ser Ala 
820 825 830 

Lys Gly 



(2) INFORMATION FOR SEQ ID NO: 3: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 1794 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: DNA (genomic) 

(ix) FEATURE: 

(A) NAME/KEY: CDS 

(B) LOCATION: 1..1794 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 3: 

ATG GAA GAC CTC AGG CTT TCC TTG GAG CTC TCC CGG GTG CGC ACC GAC 
Met Glu Asp Leu Arg Leu Ser Leu Glu Leu Ser Arg Val Arg Thr Asp 
1 5 10 15 

CTC CCC CTG GAG GTG GAC CTC GCC CAG GGG CGG GAG CCC GAC CGG GAG 
Leu Pro Leu Glu Val Asp Leu Ala Gln Gly Arg Glu Pro Asp Arg Glu 
20 25 30 

GGG CTT AGG GCC TTC CTG GAG AGG CTG GAG TTC GGC AGC CTC CTC CAC 
Gly Leu Arg Ala Phe Leu Glu Arg Leu Glu Phe Gly Ser Leu Leu His 
35 40 45 
GAG TTC GGC CTC CTG GAG GCC CCC GCC CCC CTG GAG GAG GCC CCC TGG 
Glu Phe Gly Leu Leu Glu Ala Pro Ala Pro Leu Glu Glu Ala Pro Trp 
50 55 60 

CCC CCG CCG GAA GGG GCC TTC GTG GGC TTC GTC CTC TCC CGC CCC GAG 
Pro Pro Pro Glu Gly Ala Phe Val Gly Phe Val Leu Ser Arg Pro Glu 
65 70 75 80 

CCC ATG TGG GCG GAG CTT AAA GCC CTG GCC GCC TGC AGG GAC GGC CGG 
Pro Met Trp Ala Glu Leu Lys Ala Leu Ala Ala Cys Arg Asp Gly Arg 
85 90 95 

GTG CAC CGG GCA GCA GAC CCC TTG GCG GGG CTA AAG GAC CTC AAG GAG 
Val His Arg Ala Ala Asp Pro Leu Ala Gly Leu Lys Asp Leu Lys Glu 
100 105 110 

GTC CGG GGT CTC CTC GCC AAG GAC CTC GCC GTC TTG GCC TCG AGG GAG 
Val Arg Gly Leu Leu Ala Lys Asp Leu Ala Val Leu Ala Ser Arg Glu 
115 120 125 

GGG CTA GAC CTC GTG CCC GGG GAC GAC CCC ATG CTC CTC GCC TAC CTC 
Gly Leu Asp Leu Val Pro Gly Asp Asp Pro Met Leu Leu Ala Tyr Leu 
130 135 140 

CTG GAC CCC TCC AAC ACC ACC CCC GAG GGG GTG GCG CGG CGC TAC GGG 
Leu Asp Pro Ser Asn Thr Thr Pro Glu Gly Val Ala Arg Arg Tyr Gly 
145 150 155 160 

GGG GAG TGG ACG GAG GAC GCC GCC CAC CGG GCC CTC CTC TCG GAG AGG 
Gly Glu Trp Thr Glu Asp Ala Ala His Arg Ala Leu Leu Ser Glu Arg 
165 170 175 

CTC CAT CGG AAC CTC CTT AAG CGC CTC GAG GGG GAG GAG AAG CTC CTT 
Leu His Arg Asn Leu Leu Lys Arg Leu Glu Gly Glu Glu Lys Leu Leu 
180 185 190 

TGG CTC TAC CAC GAG GTG GAA AAG CCC CTC TCC CGG GTC CTG GCC CAC 
Trp Leu Tyr His Glu Val Glu Lys Pro Leu Ser Arg Val Leu Ala His 
195 200 205 

ATG GAG GCC ACC GGG GTA CGG CTG GAC GTG GCC TAC CTG CAG GCC CTT 
Met Glu Ala Thr Gly Val Arg Leu Asp Val Ala Tyr Leu Gln Ala Leu 
210 215 220 

TCC CTG GAG CTT GCG GAG GAG ATC CGC CGC CTC GAG GAG GAG GTC TTC 
Ser Leu Glu Leu Ala Glu Glu Ile Arg Arg Leu Glu Glu Glu Val Phe 
225 230 235 240 

CGC TTG GCG GGC CAC CCC TTC AAC CTC AAC TCC CGG GAC CAG CTG GAA 
Arg Leu Ala Gly His Pro Phe Asn Leu Asn Ser Arg Asp Gln Leu Glu 
245 250 255 

AGG GTG CTC TTT GAC GAG CTT AGG CTT CCC GCC TTG GGG AAG ACG CAA 
Arg Val Leu Phe Asp Glu Leu Arg Leu Pro Ala Leu Gly Lys Thr Gln 
260 265 270 

AAG ACG GGC AAG CGC TCC ACC AGC GCC GCG GTG CTG GAG GCC CTA CGG 
Lys Thr Gly Lys Arg Ser Thr Ser Ala Ala Val Leu Glu Ala Leu Arg 
275 280 285 

GAG GCC CAC CCC ATC GTG GAG AAG ATC CTC CAG CAC CGG GAG CTC ACC 
Glu Ala His Pro Ile Val Glu Lys Ile Leu Gln His Arg Glu Leu Thr 
290 295 300 

AAG CTC AAG AAC ACC TAC GTG GAC CCC CTC CCA AGC CTC GTC CAC CCG 
Lys Leu Lys Asn Thr Tyr Val Asp Pro Leu Pro Ser Leu Val His Pro 
305 310 315 320 

AGG ACG GGC CGC CTC CAC ACC CGC TTC AAC CAG ACG GCC ACG GCC ACG 
Arg Thr Gly Arg Leu His Thr Arg Phe Asn Gln Thr Ala Thr Ala Thr 
325 330 335 

GGG AGG CTT AGT AGC TCC GAC CCC AAC CTG CAG AAC ATC CCC GTC CGC 
Gly Arg Leu Ser Ser Ser Asp Pro Asn Leu Gln Asn Ile Pro Val Arg 
340 345 350 

ACC CCC TTG GGC CAG AGG ATC CGC CGG GCC TTC GTG GCC GAG GCG GGA 
Thr Pro Leu Gly Gln Arg Ile Arg Arg Ala Phe Val Ala Glu Ala Gly 
355 360 365 

TGG GCG TTG GTG GCC CTG GAC TAT AGC CAG ATA GAG CTC CGC GTC CTC 
Trp Ala Leu Val Ala Leu Asp Tyr Ser Gln Ile Glu Leu Arg Val Leu 
370 375 380 

GCC CAC CTC TCC GGG GAC GAG AAC CTG ATC AGG GTC TTC CAG GAG GGG 
Ala His Leu Ser Gly Asp Glu Asn Leu Ile Arg Val Phe Gln Glu Gly 
385 390 395 400 

AAG GAC ATC CAC ACC CAG ACC GCA AGC TGG ATG TTC GGC GTC CCC CCG 
Lys Asp Ile His Thr Gln Thr Ala Ser Trp Met Phe Gly Val Pro Pro 
405 410 415 

GAG GCC GTG GAC CCC CTG ATG CGC CGG GCG GCC AAG ACG GTG AAC TTC 
Glu Ala Val Asp Pro Leu Met Arg Arg Ala Ala Lys Thr Val Asn Phe 
420 425 430 

GGC GTC CTC TAC GGC ATG TCC GCC CAT AGG CTC TCC CAG GAG CTT GCC 
Gly Val Leu Tyr Gly Met Ser Ala His Arg Leu Ser Gln Glu Leu Ala 
435 440 445 

ATC CCC TAC GAG GAG GCG GTG GCC TTT ATA GAG CGC TAC TTC CAA AGC 
Ile Pro Tyr Glu Glu Ala Val Ala Phe Ile Glu Arg Tyr Phe Gln Ser 
450 455 460 

TTC CCC AAG GTG CGG GCC TGG ATA GAA AAG ACC CTG GAG GAG GGG AGG 
Phe Pro Lys Val Arg Ala Trp Ile Glu Lys Thr Leu Glu Glu Gly Arg 
465 470 475 480 

AAG CGG GGC TAC GTG GAA ACC CTC TTC GGA AGA AGG CGC TAC GTG CCC 
Lys Arg Gly Tyr Val Glu Thr Leu Phe Gly Arg Arg Arg Tyr Val Pro 
485 490 495 

GAC CTC AAC GCC CGG GTG AAG AGC GTC AGG GAG GCC GCG GAG CGC ATG 
Asp Leu Asn Ala Arg Val Lys Ser Val Arg Glu Ala Ala Glu Arg Met 
500 505 510 

GCC TTC AAC ATG CCC GTC CAG GGC ACC GCC GCC GAC CTC ATG AAG CTC 
Ala Phe Asn Met Pro Val Gln Gly Thr Ala Ala Asp Leu Met Lys Leu 
515 520 525 

GCC ATG GTG AAG CTC TTC CCC CGC CTC CGG GAG ATG GGG GCC CGC ATG 
Ala Met Val Lys Leu Phe Pro Arg Leu Arg Glu Met Gly Ala Arg Met 
530 535 540 

CTC CTC CAG GTC CAC GAC GAG CTC CTC CTG GAG GCC CCC CAA GCG CGG 
Leu Leu Gln Val His Asp Glu Leu Leu Leu Glu Ala Pro Gln Ala Arg 
545 550 555 560 

GCC GAG GAG GTG GCG GCT TTG GCC AAG GAG GCC ATG GAG AAG GCC TAT 
Ala Glu Glu Val Ala Ala Leu Ala Lys Glu Ala Met Glu Lys Ala Tyr 
565 570 575 

CCC CTC GCC GTG CCC CTG GAG GTG GAG GTG GGG ATG GGG GAG GAC TGG 
Pro Leu Ala Val Pro Leu Glu Val Glu Val Gly Met Gly Glu Asp Trp 
580 585 590 

CTT TCC GCC AAG GGT TAG 
Leu Ser Ala Lys Gly 
595 

(2) INFORMATION FOR SEQ ID NO: 4: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 597 amino acids 

(B) TYPE: amino acid 
(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: protein 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 4: 

Met Glu Asp Leu Arg Leu Ser Leu Glu Leu Ser Arg Val Arg Thr Asp 
1 5 10 15 

Leu Pro Leu Glu Val Asp Leu Ala Gln Gly Arg Glu Pro Asp Arg Glu 
20 25 30 

Gly Leu Arg Ala Phe Leu Glu Arg Leu Glu Phe Gly Ser Leu Leu His 
35 40 45 

Glu Phe Gly Leu Leu Glu Ala Pro Ala Pro Leu Glu Glu Ala Pro Trp 
50 55 60 

Pro Pro Pro Glu Gly Ala Phe Val Gly Phe Val Leu Ser Arg Pro Glu 
65 70 75 80 

Pro Met Trp Ala Glu Leu Lys Ala Leu Ala Ala Cys Arg Asp Gly Arg 
85 90 95 

Val His Arg Ala Ala Asp Pro Leu Ala Gly Leu Lys Asp Leu Lys Glu 
100 105 110 

Val Arg Gly Leu Leu Ala Lys Asp Leu Ala Val Leu Ala Ser Arg Glu 
115 120 125 

Gly Leu Asp Leu Val Pro Gly Asp Asp Pro Met Leu Leu Ala Tyr Leu 
130 135 140 

Leu Asp Pro Ser Asn Thr Thr Pro Glu Gly Val Ala Arg Arg Tyr Gly 
145 150 155 160 
Gly Glu Trp Thr Glu Asp Ala Ala His Arg Ala Leu Leu Ser Glu Arg 
165 170 175 

Leu His Arg Asn Leu Leu Lys Arg Leu Glu Gly Glu Glu Lys Leu Leu 
180 185 190 

Trp Leu Tyr His Glu Val Glu Lys Pro Leu Ser Arg Val Leu Ala His 
195 200 205 

Met Glu Ala Thr Gly Val Arg Leu Asp Val Ala Tyr Leu Gln Ala Leu 
210 215 220 

Ser Leu Glu Leu Ala Glu Glu Ile Arg Arg Leu Glu Glu Glu Val Phe 
225 230 235 240 

Arg Leu Ala Gly His Pro Phe Asn Leu Asn Ser Arg Asp Gln Leu Glu 
245 250 255 

Arg Val Leu Phe Asp Glu Leu Arg Leu Pro Ala Leu Gly Lys Thr Gln 
260 265 270 

Lys Thr Gly Lys Arg Ser Thr Ser Ala Ala Val Leu Glu Ala Leu Arg 
275 280 285 

Glu Ala His Pro Ile Val Glu Lys Ile Leu Gln His Arg Glu Leu Thr 
290 295 300 

Lys Leu Lys Asn Thr Tyr Val Asp Pro Leu Pro Ser Leu Val His Pro 
305 310 315 320 

Arg Thr Gly Arg Leu His Thr Arg Phe Asn Gln Thr Ala Thr Ala Thr 
325 330 335 

Gly Arg Leu Ser Ser Ser Asp Pro Asn Leu Gln Asn Ile Pro Val Arg 
340 345 350 

Thr Pro Leu Gly Gln Arg Ile Arg Arg Ala Phe Val Ala Glu Ala Gly 
355 360 365 

Trp Ala Leu Val Ala Leu Asp Tyr Ser Gln Ile Glu Leu Arg Val Leu 
370 375 380 

Ala His Leu Ser Gly Asp Glu Asn Leu Ile Arg Val Phe Gln Glu Gly 
385 390 395 400 

Lys Asp Ile His Thr Gln Thr Ala Ser Trp Met Phe Gly Val Pro Pro 
405 410 415 

Glu Ala Val Asp Pro Leu Met Arg Arg Ala Ala Lys Thr Val Asn Phe 
420 425 430 

Gly Val Leu Tyr Gly Met Ser Ala His Arg Leu Ser Gln Glu Leu Ala 
435 440 445 

Ile Pro Tyr Glu Glu Ala Val Ala Phe Ile Glu Arg Tyr Phe Gln Ser 
450 455 460 

Phe Pro Lys Val Arg Ala Trp Ile Glu Lys Thr Leu Glu Glu Gly Arg 
465 470 475 480 

Lys Arg Gly Tyr Val Glu Thr Leu Phe Gly Arg Arg Arg Tyr Val Pro 
485 490 495 

Asp Leu Asn Ala Arg Val Lys Ser Val Arg Glu Ala Ala Glu Arg Met 
500 505 510 

Ala Phe Asn Met Pro Val Gln Gly Thr Ala Ala Asp Leu Met Lys Leu 
515 520 525 

Ala Met Val Lys Leu Phe Pro Arg Leu Arg Glu Met Gly Ala Arg Met 
530 535 540 

Leu Leu Gln Val His Asp Glu Leu Leu Leu Glu Ala Pro Gln Ala Arg 
545 550 555 560 

Ala Glu Glu Val Ala Ala Leu Ala Lys Glu Ala Met Glu Lys Ala Tyr 
565 570 575 

Pro Leu Ala Val Pro Leu Glu Val Glu Val Gly Met Gly Glu Asp Trp 
580 585 590 

Leu Ser Ala Lys Gly 
595 



(2) INFORMATION FOR SEQ ID NO: 5: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 560 amino acids 

(B) TYPE: amino acid 
(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: protein 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 5: 

Leu Glu Arg Leu Glu Phe Gly Ser Leu Leu His Glu Phe Gly Leu Leu 
1 5 10 15 

Glu Ala Pro Ala Pro Leu Glu Glu Ala Pro Trp Pro Pro Pro Glu Gly 
20 25 30 

Ala Phe Val Gly Phe Val Leu Ser Arg Pro Glu Pro Met Trp Ala Glu 
35 40 45 

Leu Lys Ala Leu Ala Ala Cys Arg Asp Gly Arg Val His Arg Ala Ala 
50 55 60 

Asp Pro Leu Ala Gly Leu Lys Asp Leu Lys Glu Val Arg Gly Leu Leu 
65 70 75 80 

Ala Lys Asp Leu Ala Val Leu Ala Ser Arg Glu Gly Leu Asp Leu Val 
85 90 95 

Pro Gly Asp Asp Pro Met Leu Leu Ala Tyr Leu Leu Asp Pro Ser Asn 
100 105 110 

Thr Thr Pro Glu Gly Val Ala Arg Arg Tyr Gly Gly Glu Trp Thr Glu 
115 120 125 

Asp Ala Ala His Arg Ala Leu Leu Ser Glu Arg Leu His Arg Asn Leu 
130 135 140 

Leu Lys Arg Leu Glu Gly Glu Glu Lys Leu Leu Trp Leu Tyr His Glu 
145 150 155 160 

Val Glu Lys Pro Leu Ser Arg Val Leu Ala His Met Glu Ala Thr Gly 
165 170 175 

Val Arg Leu Asp Val Ala Tyr Leu Gln Ala Leu Ser Leu Glu Leu Ala 
180 185 190 

Glu Glu Ile Arg Arg Leu Glu Glu Glu Val Phe Arg Leu Ala Gly His 
195 200 205 

Pro Phe Asn Leu Asn Ser Arg Asp Gln Leu Glu Arg Val Leu Phe Asp 
210 215 220 

Glu Leu Arg Leu Pro Ala Leu Gly Lys Thr Gln Lys Thr Gly Lys Arg 
225 230 235 240 

Ser Thr Ser Ala Ala Val Leu Glu Ala Leu Arg Glu Ala His Pro Ile 
245 250 255 

Val Glu Lys Ile Leu Gln His Arg Glu Leu Thr Lys Leu Lys Asn Thr 
260 265 270 

Tyr Val Asp Pro Leu Pro Ser Leu Val His Pro Arg Thr Gly Arg Leu 
275 280 285 

His Thr Arg Phe Asn Gln Thr Ala Thr Ala Thr Gly Arg Leu Ser Ser 
290 295 300 

Ser Asp Pro Asn Leu Gln Asn Ile Pro Val Arg Thr Pro Leu Gly Gln 
305 310 315 320 

Arg Ile Arg Arg Ala Phe Val Ala Glu Ala Gly Trp Ala Leu Val Ala 
325 330 335 

Leu Asp Tyr Ser Gln Ile Glu Leu Arg Val Leu Ala His Leu Ser Gly 
340 345 350 

Asp Glu Asn Leu Ile Arg Val Phe Gln Glu Gly Lys Asp Ile His Thr 
355 360 365 

Gln Thr Ala Ser Trp Met Phe Gly Val Pro Pro Glu Ala Val Asp Pro 
370 375 380 

Leu Met Arg Arg Ala Ala Lys Thr Val Asn Phe Gly Val Leu Tyr Gly 
385 390 395 400 

Met Ser Ala His Arg Leu Ser Gln Glu Leu Ala Ile Pro Tyr Glu Glu 
405 410 415 

Ala Val Ala Phe Ile Glu Arg Tyr Phe Gln Ser Phe Pro Lys Val Arg 
420 425 430 

Ala Trp Ile Glu Lys Thr Leu Glu Glu Gly Arg Lys Arg Gly Tyr Val 
435 440 445 

Glu Thr Leu Phe Gly Arg Arg Arg Tyr Val Pro Asp Leu Asn Ala Arg 
450 455 460 

Val Lys Ser Val Arg Glu Ala Ala Glu Arg Met Ala Phe Asn Met Pro 
465 470 475 480 

Val Gln Gly Thr Ala Ala Asp Leu Met Lys Leu Ala Met Val Lys Leu 
485 490 495 

Phe Pro Arg Leu Arg Glu Met Gly Ala Arg Met Leu Leu Gln Val His 
500 505 510 

Asp Glu Leu Leu Leu Glu Ala Pro Gln Ala Arg Ala Glu Glu Val Ala 
515 520 525 

Ala Leu Ala Lys Glu Ala Met Glu Lys Ala Tyr Pro Leu Ala Val Pro 
530 535 540 

Leu Glu Val Glu Val Gly Met Gly Glu Asp Trp Leu Ser Ala Lys Gly 
545 550 555 560 



(2) INFORMATION FOR SEQ ID NO: 6: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 24 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: DNA (genomic) 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO:6: 
CGCCAGGGTT TTCCCAGTCA CGAC 



(2) INFORMATION FOR SEQ ID NO: 7: 

(i) SEQUENCE CHARACTERISTICS: 
(A) LENGTH: 24 base pairs 
(B) TYPE: nucleic acid 
(C) STRANDEDNESS: single 
(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: DNA (genomic) 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 7: 
AGCGGATAAC AATTTCACAC AGGA 



(2) INFORMATION FOR SEQ ID NO: 8: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 21 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: DNA (genomic) 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 8: 
CTAAGTAGCT CCGATCCCAA C 



(2) INFORMATION FOR SEQ ID NO: 9: 

(i) SEQUENCE CHARACTERISTICS: 
(A) LENGTH: 25 base pairs 
(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: DNA (genomic) 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 9: 
ATCACTCCTT GGCGGAGAGC CAGTC 



(2) INFORMATION FOR SEQ ID NO: 10: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 26 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 
(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: DNA (genomic) 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 10: 
ATTTAGCACA TATGGCGATG CTTCCC 



(2) INFORMATION FOR SEQ ID NO: 11: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 21 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: DNA (genomic) 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 11: 
CTTTCCAGCT CCGACCCCAA C 



(2) INFORMATION FOR SEQ ID NO: 12: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 25 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: DNA (genomic) 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 12: 
CCTACTCCTT GGCGGAGAGC CAGTC 



(2) INFORMATION FOR SEQ ID NO: 13: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 25 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: DNA (genomic) 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 13: 
TGGATGTCCC TCCCCTCCTG AAAGA 



(2) INFORMATION FOR SEQ ID NO: 14: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 28 base pairs 

(B) TYPE: nucleic acid 
(C) STRANDEDNESS: single 
(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: DNA (genomic) 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 14: 
CCCTTTCCCG GAAGCTTTCC CAGGTGCA 



(2) INFORMATION FOR SEQ ID NO: 15: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 28 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: DNA (genomic) 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 15: 
TGCACCTGGG AAAGCTTCCG GGAAAGGG 
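
As listed, SEQ ID NO: 14 and SEQ ID NO: 15 appear to be reverse complements of one another. The following is a minimal sketch for checking that kind of relationship between two listed oligonucleotides; it is not part of the original listing, the helper name `reverse_complement` is hypothetical, and it assumes unambiguous A/C/G/T bases only.

```python
# Translation table mapping each base to its Watson-Crick complement.
_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA string (whitespace ignored)."""
    cleaned = "".join(seq.split()).upper()
    return cleaned.translate(_COMPLEMENT)[::-1]


seq_id_14 = "CCCTTTCCCG GAAGCTTTCC CAGGTGCA"
seq_id_15 = "TGCACCTGGG AAAGCTTCCG GGAAAGGG"

# True when the two oligonucleotides anneal to each other over their full length.
is_complementary_pair = reverse_complement(seq_id_14) == "".join(seq_id_15.split())
```

The listing groups bases in blocks of ten, so whitespace is stripped before the comparison.
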



(2) INFORMATION FOR SEQ ID NO: 16: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 30 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: DNA (genomic) 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 16: 
CCTGCAGTAC CGGGAGCTCA CCAAGCTCAA 



(2) INFORMATION FOR SEQ ID NO: 17: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 30 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: DNA (genomic) 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 17: 
TTGAGCTTGG TGAGCTCCCG GTACTGCAGG 



(2) INFORMATION FOR SEQ ID NO: 18: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 22 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: DNA (genomic) 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 18: 
TGGACTATAG CCAGATAGAG CT 



(2) INFORMATION FOR SEQ ID NO: 19: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 21 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: DNA (genomic) 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 19: 
AAGCGAAGAC CTCCTCCTCG A 



(2) INFORMATION FOR SEQ ID NO: 20: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 22 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: DNA (genomic) 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:20: 
AGTTCGGCAG CCTCCTCCAC GA 

(2) INFORMATION FOR SEQ ID NO: 21: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 22 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: DNA (genomic) 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 21: 
TCCAAGGAAA GCCTGAGGTC TT 



(2) INFORMATION FOR SEQ ID NO: 22: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 23 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: DNA (genomic) 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 22: 
AAGCTCGCCA TGGTGAAGCT CTT 

(2) INFORMATION FOR SEQ ID NO: 23: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 22 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: DNA (genomic) 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 23: 
TCGGAGACGA GTTGGTAGAG GT 

(2) INFORMATION FOR SEQ ID NO: 24: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 22 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 
(ii) MOLECULE TYPE: DNA (genomic) 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 24: 



ACCTCTACCA ACTCGTCTCC GA 



(2) INFORMATION FOR SEQ ID NO: 25: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 19 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: DNA (genomic) 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 25: 
AGAGGACGAA GCCCACGAA 



(2) INFORMATION FOR SEQ ID NO:26: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 20 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: DNA (genomic) 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 26: 
AGGAGGTAGG CGAGGAGCAT 



(2) INFORMATION FOR SEQ ID NO: 27: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 20 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: DNA (genomic) 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 27: 
ATGCTCCTCG CCTACCTCCT 



(2) INFORMATION FOR SEQ ID NO: 28: 
(i) SEQUENCE CHARACTERISTICS: 
(A) LENGTH: 21 base pairs

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA (genomic)



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 28:
TCGAGGAGGA GGTCTTCGCT T



(2) INFORMATION FOR SEQ ID NO: 29:

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 22 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: DNA (genomic) 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 29:
AGCTCTATCT GGCTATAGTC CA 



(2) INFORMATION FOR SEQ ID NO: 30: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 20 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA (genomic) 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 30:
ATAGGCTCTC CCAGGAGCTT 



(2) INFORMATION FOR SEQ ID NO: 31: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 23 base pairs

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA (genomic) 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 31:
AAGAGCTTCA CCATGGCGAG CTT

(2) INFORMATION FOR SEQ ID NO: 32:

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 21 base pairs

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: DNA (genomic)



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 32:
TTCCCCTGGA GGCGTTTCTG A



(2) INFORMATION FOR SEQ ID NO: 33: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 21 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA (genomic) 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 33:
AAAGACCACG AAGACGGCCT T 



(2) INFORMATION FOR SEQ ID NO: 34:

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 21 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA (genomic) 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 34:
AAGGCCGTCT TCGTGGTCTT T 



(2) INFORMATION FOR SEQ ID NO: 35: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 21 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: DNA (genomic) 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 35:
AAGGAGTGGG GAAGCCTGGA A



(2) INFORMATION FOR SEQ ID NO: 36: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 21 base pairs 

(B) TYPE: nucleic acid

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA (genomic) 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 36: 



TTCCAGGCTT CCCCACTCCT T 



(2) INFORMATION FOR SEQ ID NO: 37:

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 21 base pairs

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: DNA (genomic) 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 37: 
TTCTTCCGAA GAGGGTTTCC A 



(2) INFORMATION FOR SEQ ID NO: 38: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 22 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA (genomic) 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO:38: 
GCGTCCAGGA GCGCCCTGAG GA 



(2) INFORMATION FOR SEQ ID NO: 39: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 23 base pairs 

(B) TYPE: nucleic acid

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: DNA (genomic) 
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 39:
CCTCAGGGCG CTCCTGGACG CCA 



(2) INFORMATION FOR SEQ ID NO: 40: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 20 base pairs 

(B) TYPE: nucleic acid

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: DNA (genomic) 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 40:
TTCGTCCTCT CCCGCCCCGA



(2) INFORMATION FOR SEQ ID NO: 41: 

(i) SEQUENCE CHARACTERISTICS: 
(A) LENGTH: 22 base pairs 
(B) TYPE: nucleic acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: DNA (genomic) 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 41:
CCAACCTGCA GAACATCCCC GT 



(2) INFORMATION FOR SEQ ID NO: 42: 

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 20 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: DNA (genomic) 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 42: 
GGTGTGGATG TCCTTCCCCT 



(2) INFORMATION FOR SEQ ID NO: 43: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 25 base pairs 

(B) TYPE: nucleic acid

(C) STRANDEDNESS: single
(D) TOPOLOGY: linear



WO 96/14405 






(ii) MOLECULE TYPE: DNA (genomic) 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 43:
CCCTGCCGTT TAGAGGAAGT TCAAG 



(2) INFORMATION FOR SEQ ID NO: 44: 

(i) SEQUENCE CHARACTERISTICS: 
(A) LENGTH: 25 base pairs 
(B) TYPE: nucleic acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA (genomic) 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 44:
CTTGAACTTC CTCTAAACGG CAGGG 



(2) INFORMATION FOR SEQ ID NO: 45: 

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 22 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: DNA (genomic) 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 45:
ACCCGGCCTT TGGGTTCAAA GA 



(2) INFORMATION FOR SEQ ID NO: 46: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 22 base pairs 

(B) TYPE: nucleic acid

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: DNA (genomic) 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 46:
TCTTTGAACC CAAAGGCCGG GT 



(2) INFORMATION FOR SEQ ID NO: 47: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 20 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 
(D) TOPOLOGY: linear
(ii) MOLECULE TYPE: DNA (genomic) 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 47: 
TTCCCGTGCT CCTTCCGCTC 

(2) INFORMATION FOR SEQ ID NO: 48: 

(i) SEQUENCE CHARACTERISTICS: 
(A) LENGTH: 20 base pairs 
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA (genomic) 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:48: 
CTCGCCTTCC TCGTGCCCTT 

(2) INFORMATION FOR SEQ ID NO: 49: 

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 23 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA (genomic)

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 49: 
GCTTCCGGCT CGTATGTTGT GTG 

(2) INFORMATION FOR SEQ ID NO:50: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 70 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: DNA (genomic)

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 50: 
GGAAAGCCTG AGGTCTTCCA TAGCTGTTTC CTGTGTGAAA TTGTTATCCG CTCACAATTC 
CACACAACAT 



(2) INFORMATION FOR SEQ ID NO: 51: 
(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 81 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA (genomic) 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 51: 
ACCCGGCCTT TGGGTTCAAA GAGCGGAACG ATCGCCTCCA TAGCTGTTTC CTGTGTGAAA 60
TTGTTATCCG CTCACAATTC C 81
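Several of the oligonucleotides listed above appear to be reverse complements of one another (for example SEQ ID NO: 23 and 24, 26 and 27, and 33 and 34). The following Python sketch is an illustrative check only and is not part of the original disclosure; the function and variable names are hypothetical, and the sequences are copied from the listing with the grouping spaces ignored.

```python
# Illustrative sketch: names below are hypothetical, not taken from the source.
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA sequence, ignoring spaces."""
    cleaned = seq.replace(" ", "").upper()
    return cleaned.translate(COMPLEMENT)[::-1]


# Sequences copied from the listing (SEQ ID NO: 23/24, 26/27, 33/34).
PAIRS = {
    (23, 24): ("TCGGAGACGA GTTGGTAGAG GT", "ACCTCTACCA ACTCGTCTCC GA"),
    (26, 27): ("AGGAGGTAGG CGAGGAGCAT", "ATGCTCCTCG CCTACCTCCT"),
    (33, 34): ("AAAGACCACG AAGACGGCCT T", "AAGGCCGTCT TCGTGGTCTT T"),
}


def complementary_pairs(pairs):
    """Return the ID pairs whose second member is the reverse complement of the first."""
    return [
        ids
        for ids, (first, second) in pairs.items()
        if reverse_complement(first) == second.replace(" ", "").upper()
    ]


if __name__ == "__main__":
    print(complementary_pairs(PAIRS))
```

The same helper can be applied to any other pair in the listing to see whether two primers anneal to the same site on opposite strands.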
What is claimed is:

1. A purified DNA in accordance with SEQ ID NO: 3, said
DNA encoding for a member selected from the group consisting of a
polypeptide in accordance with SEQ ID NO: 5, and fragments thereof having
polymerase activity.

2. The DNA of claim 1 consisting of nucleotides 112 to 1791 
of SEQ ID NO: 3. 

3. The DNA of claim 1 consisting of nucleotides 1 to 1791 of 
SEQ ID NO: 3.

4. A vector wherein the DNA of SEQ ID NO: 3 is operably
linked to a promoter.

5. Plasmid p21EHcMl. 1, having ATCC Accession No. 69632. 

6. A host cell transformed with a DNA selected from the 
group consisting of the DNAs of claims 1, 2, and 3. 

7. The host cell of claim 6, wherein said host cell is capable
of expressing a thermostable polypeptide encoded by said DNA, said
polypeptide having DNA polymerase activity.

8. The host cell of claim 7, wherein said host cell is a 
prokaryotic cell. 



9. The host cell of claim 8, wherein said host cell is an E. coli
cell.

10. An expression vector operably linked to nucleotides 112
to 1791 of SEQ ID NO: 3, said nucleotides encoding a polypeptide having 
thermostable DNA polymerase activity. 

11. The expression vector of claim 10 having at least one
insert consisting essentially of nucleotides 112 to 1791 of SEQ ID NO: 3.

12. A purified fragment of Thermus flavus DNA polymerase
I protein in accordance with SEQ ID NO: 5, said fragment having
thermostable DNA polymerase activity.

13. A fragment of Thermus flavus DNA polymerase I having
thermostable DNA polymerase activity and consisting of amino acids 2 to 560
of SEQ ID NO: 5.

14. A purified fragment of Thermus flavus DNA polymerase 
I protein encoded by the insert of plasmid p21EHcMl.l, having ATCC 
Accession no. 69632. 

15. The purified fragment of claim 14 wherein the fragment
has a DNA polymerase activity between 60,000 U/mg protein and 600,000
U/mg protein.

16. A thermostable polypeptide having DNA polymerase
activity, said polypeptide consisting essentially of the amino acid sequence of
SEQ ID NO: 5.

17. A method for purifying a thermostable polypeptide having
DNA polymerase activity comprising the steps of: 

transforming a host cell with a DNA to create a transformed 
host cell, said DNA encoding for a thermostable polypeptide having 
DNA polymerase activity and being selected from the group consisting
of the DNAs of claims 1, 2, and 3;

cultivating said transformed host cell under conditions to 
promote expression of a thermostable polypeptide encoded by said 
DNA; and 

purifying said thermostable polypeptide with a monoclonal
antibody that is immunologically cross-reactive with said thermostable
polypeptide.

18. The method of claim 17 wherein the host cell is 
transformed with the DNA of claim 3. 

19. The method of claim 17 wherein said immunologically
cross-reactive monoclonal antibody has specificity for a Thermus aquaticus
DNA polymerase.



20. A method of purifying a thermostable polypeptide having 
DNA polymerase activity comprising the steps of: 

a) expressing said thermostable polypeptide in a host cell, said 
polypeptide having an amino acid sequence encoded by a DNA selected from 
the group consisting of the DNAs of claims 1, 2, and 3; 

b) lysing the cell to create a suspension containing said 
thermostable polypeptide and host cell proteins and cell debris; 

c) contacting a soluble portion of said suspension with an 
antibody that is immunologically cross-reactive with said thermostable 
polypeptide and under conditions wherein the antibody binds to said 
thermostable polypeptide to form an antibody-polypeptide complex;

d) isolating the antibody-polypeptide complex; and 

e) separating said thermostable polypeptide from said isolated 
antibody-polypeptide complex to provide a purified thermostable polypeptide. 

21. The method of claim 20 further comprising between steps 
(b) and (c) the steps of: 

heating said suspension to denature host cell proteins; and 
centrifuging said suspension to remove said cell debris and 
denatured host cell proteins. 

22. The method of claim 20 or 21 wherein said 
immunologically cross-reactive antibody is a monoclonal antibody. 

23. The method of claim 22 wherein said immunologically 
cross-reactive monoclonal antibody is specific for Thermus aquaticus DNA
polymerase I. 

24. The method of claim 22 wherein the purified thermostable
polypeptide has a DNA polymerase activity between 79,500 U/mg protein and 
600,000 U/mg protein. 

25. The method of claim 22 wherein the purified thermostable
polypeptide has a DNA polymerase activity between 217,600 U/mg protein
and 600,000 U/mg protein.
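
The nucleotide ranges recited in claims 2, 3, 10 and 11 are 1-based and inclusive, so their lengths can be checked with simple arithmetic. The Python sketch below is illustrative only and is not part of the claims; the helper names are hypothetical, and it makes no assumption about reading frame or stop codons beyond counting whole triplets in each span.

```python
# Illustrative sketch: helper names are hypothetical, not taken from the source.
def span_length(start: int, end: int) -> int:
    """Length in nucleotides of a 1-based, inclusive range."""
    if start < 1 or end < start:
        raise ValueError("expected 1 <= start <= end")
    return end - start + 1


def whole_codons(start: int, end: int) -> int:
    """Number of complete triplets that fit in the range."""
    return span_length(start, end) // 3


# Ranges of SEQ ID NO: 3 as recited in the claims above.
RANGES = {
    "nucleotides 112 to 1791": (112, 1791),
    "nucleotides 1 to 1791": (1, 1791),
}

if __name__ == "__main__":
    for label, (start, end) in RANGES.items():
        print(label, span_length(start, end), "nt,", whole_codons(start, end), "whole codons")
```

Whether each triplet count corresponds exactly to residues of SEQ ID NO: 5 depends on the reading frame and any stop codon, which this section does not state.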
10 20 30

tacttcqqcg gggtgaagct cggggccggg gggcttgtgc gggcctacgg gggggtggcg 
atgaagclgl III alt tig a gllclgglcc cccgaacacg cccggatgcc cccccaccgc 

70 80 90 100 110 FTFLZ 120

qcggaggcct taagcgggcg cccaaggtcc ccttggtgga gcgggtgggg [ ctcqccttcc 
?gcct?cgga att?gcccgc gggttccagg ggaaccacct cgcccacccc gagcggaagg 

130 140 150 160 170 180 

tcatoccctti cgccgaggtg ggccgggtct acgccctcc.t ggaggcccgc gccctgaagg 
^^g^ glggltllal ??ggcccaga tgcgggagga cctccgggcg cgggacttcc 

190 200 210 220 230 240 

ccgaggagac ctacaccccg gagggcgtgc gcttcgccct cctcctcccc aagcccgagc 
ggctlltltg gatgtggggc ctcccgcacg cgaagcggga ggaggagggg ttcgggctcg 

250 FTFLQ 260 270 280 290 F miQ 300

aaaaaagttt lectcaoaqc q ctcctggacg ccqfc ccgggg acaggtggcc c^ggagtagc 
ggSSUfl HQg agiisi iSBBgfSE ^ >W CCC tgtccaccgg gacctcatcg 

^10 TFLP 320 FTTL X 330 340 350 360 

^TdGAGGCGA TCGTT C^ dT_CIIIGAAcLc AAAGGCCGGG fe TCCTGGT "ACGGCCAC 
Vj^rjrrr.rr a r. r & Ar. r.C CAAACTTGG O TTTCCGGCCCAj GGAGGACCA CCTGCCGGTG 

OTPl W 



50 60 



RTFLW 

a 70 390 400 410 420 

CACCTGGCCT ACCGCACCTT CTTCGCCCTG AAGGGCCTCA CCACGAGCCG GGGCGAACCG 

gtggaccgga tggcgtggaa GAAGCGGGAC ttcccggagt ggtgctcggc CCCGCTTGGC 

430 440 450 460 470 480 

GTGCAGGCGG TCTACGGCTT CGCCAAGAGC CTCCTCAA GG ^fTGAAGGA ^ACGGGTA C 
CACGTCCGCC AGATGCCGAA GCGGTTCTCG G A G GA GTTC|C__GG G A CTTC CT CCTGCCCAT|G 

510 520 530 540 



IaTg^CCGTCT TCGTGGTCTrjl GACGCCAAG GCCCGCTCCT TCCGCCACGA GGCCTACGAG 

ttccggcIgI IgcIccagaI aK tgcggttc CGGGGGAGGA aggcggtgct ccggatgctc 



RTFLK trcn 560 570 580 590 600 

gcctacaagg cggggagggc cccgaccccc gaggacttcc cccggcagct cgccctcatc 
cggatgttcc gcccctcccg gggctggggg ctcctgaagg gggccgtcga gcgggagtag 

620 630 640 650 660 

AAGGAGCTGG TGGACCTCCT GGGGTTTACC CGCCTCGAGG TCCCCGGCTA CGAGGCGGAC 
TTCCTCGACC ACCTGGAGGA CCCCAAATGG GCGGAGCTCC AGGGGCCGAT GCTCCGCLifc 

(Continued)



Figure 4 (i) 
(Continued)

670 680 690 700 710 720 

GACGTCCTCG CCACCCTGGC CAAGAAGGCG GAAAAGGAGG GGTACGAGGT GCGCATCCTC 

CTGCAGGAGC GGTGGGACCG GTTCTTCCGC CTTTTCCTCC CCATGCTCCA CGCGTAGGAG 

730 FTFLB 740 750 760 770 780 

ACCGCCGACC GC dA^tttA (tCAACTCGTC TCCG AjCCGCG TCGTCGTCCT CCACCCCGAG 



TGGCGGCTGG CGC TGGAGAT GGTTGAGCAG AGGCljGGCGC AGCAGCAGGA GGTGGGGCTC 



790 RTFLA 800 810 820 830 840 

GGCCACCTCA TCACCCCGGA GTGGCTTTGG GAGAAGTACG GCCTCAAGCC GGAGCAGTGG 

CCGGTGGAGT AGTGGGGCCT CACCGAAACC CTCTTCATGC CGGAGTTCGG CCTCGTCACC 

850 860 870 880 890 900 

GTGGACTTCC GCGCCCTCGT GGGGGACCCC TCCGACAACC TCCCCGGGGT CAAGGGCATC 

CACCTGAAGG CGCGGGAGCA CCCCCTGGGG AGGCTGTTGG AGGGGCCCCA GTTCCCGTAG 



910 920 930 FTFLM 940 950 960

GGGGAGAAGA CCGCCCTCAA GCTCCT( fAAG"~GAGTGGGGAA G CCTGGAA AA CCTCCTCAAG 
CCCCTCTTCT GGCGGGAGTT CGAGGA uTiT CTCACCCCTT CGGACCTT] TT GGAGGAGTTC 



RTFLN _ 
970 980 990 1000 1010 1020

AACCTGGACC GGGTAAAGCC AGAAAACGTC CGGGAGAAGA TCAAGGCCCA CCTGGAAGAC 
TTGGACCTGG CCCATTTCGG TCTTTTGCAG GCCCTCTTCT AGTTCCGGGT GGAqqTTCTG 
ptfi 1 s 1040 1350 1060 1070 1080 

rTf a Iggcttt ^cgaSc? ctcccgggtg ccJ caccgacc tccccctgga ogtggacctc 

GAGllcCGAAA GdAACCTlCGA G AGGGCCCAC GC] GTGGCTGG AGGGGGACCT CCACCTGGAG 



TFLER1 1 aaa RTFL1 6 1 100 1110 1120 1130 TFLEFT1140 

rrrcLCCGGC GGGAGCCCGA CCGGGAGGGG CTTAGGGCCT TCCTGGAGAG GCTG4AGTTC 
CGGGTCCCCG CCCTCGGGCT GGCCCTCCCC GAATCCCGGA AGGACCTCTC CGACCTCAAG 

n^ft 1160 1170 1180 HQ 0 I 200 

sAk ssssssa vtsssz sssss ssssss 

OTCI f* 



1280 RTFLC 1290 1300 1310 „1™ 



rrrrAGCTTA AAGCCCTGGC CGCCTGCAGG GACGGCCGGG TGCACCGGGC AGCAGACCCC 

cgcctcgaIt ttcgggIccg gcggacgtcc ctgccggccc acgtggcccg tcgtctgggg 

(Continued) 

Figure 4 (ii) 
(Continued)

n*fl 1340 1350 1360 1370 1380 

HSSSSSS I^S SSSES ESSES SESSE 

mcccSS acacct 1 ^ CCCGGGGaI? ACCC ^ gggg 

AACCGGAGCT CCCTCCCCGA TCTGGAGCAC GGGCCCCTGC TGGG GfTACGA GGAGCG6A1G 



RTFLD 

. 1Afifl 1470 1480 1490 1500 

TttFtIgGACC CCTCCAACAC CACCCCCGAG GGGGTGGCGC GGCGCTACGG GGGGGAGTGG 
ISI^tgg ggIggttgtg GTGGGGGCTC CCCCACCGCG CCGCGATGCC CCCCCTCACC 

*tz*fx -\K->a 1530 1540 1550 1560 

srcrLCclll CCGCCCACCG GGCCCTCCTC TCGGAGAGGC TCCATCGGAA CCTCCTTAAG 
TGCCTCCTGC GGCGGGTGGC CCGGgIgGAG AGCCTCTCCG AGGTAGCCTT GGAGGAATTC 

+ *-„x nan 15Q0 1600 1610 1620 

CGCCTCSS GGGAGGAGAA CCTCogfS CTCTACCACG CCCCCTCTCC 
GCGGAGCTCC CCCTCCTCTT CGAGGAAACC GA6ATGGTGC TCCACCTTTT CGGGGAUA^ 

1640 1650 1660 1670 1680 

USSSS GGGTGTACCT CCGGTGGCCC SRSffi SSSSS* SgIcG^'g 

CTTTCCCTGG AGCTTGCGGA GGAGATCCGC Cgg g^gSt 

GAAAGGGACC TCGAACGCCT CCTCTAGGCG GCGG^GjCTOC ^TCCAGAA U^GAA| CCGC 



RTFL18A ig00 

sil bssS = asS sis kssssk 

sszli ssIP = ss&i = sasSS 
T.« s «;s g . ? «2s s g^s m| 

GACCTCCGGG ATGCCCTCCG GGTGGGGTAG CACCTCTTCT A ^GGAGGTCGT GGCCCTCCAfc 



RTFL18 

™»*s^ Sfll sssS = = 



TGGTTCGAGf tj cTTGTGGAT GCACCTGGGG GAGGGTTCGG AGCAGGTudQ <L£^<. 

CCCCTcSS CCCGCT^ CCAGACGG^ ACGGCC^ CGAGG|f^|i 
GCGGAGGTGT GGGCGAAGTT GGTCTGCCGG TGCCGGTGGC CCTCC GAATC ATCGAGGCTG 

(Continued) 

Figure 4(iii)
2050 FTFLS 2060 2070 2080 2090 2100

dCCAA QCTGC AGAACATCCC CGTfc CGCACC CCCTTGGGCC AGAGGATCCG CCGGGCCTTC 
GGGTTGGACG TCTTGtAG"GT'GTAGGCGTGG GGGAACCCGG TCTCCTAGGC GGCCCGGAAG 

2110 2120 2130F TFL17A2140 2150 2160 

GTGGCCGAGG CGGGATGGGC GTTGGTGGCC dTGCACTATA GCCAGATAGA GC1 CCGCGTC 



CACCGGCTCC GCCCTACCCG CAACCACCGG qACCTGATAT CGGTCTATCT CGAjGGCGCAG 



RTFLG 

2170 2180 2190 2200 2210 2220 

CTCGCCCACC TCTCCGGGGA CGAGAACCTG ATCAGG GTCT TCCAGGAGGG GAAGGACATC 
GAGCGGGTGG AGAGGCCCCT GCTCTTGGAC TAGTCCQAGA AGGTCdTCCC CTTCCTGTAG 

RTFL1 3 RTFLT 

2230 2240 2250 2260 2270 2280 

CACACCCAGA CCGCAAGCTG GATGTTCGGC GTCCCCCCGG AGGCCGTGGA CCCCCTGATG 

'gITgTgIIgtct ggcgttcgac ctacaagccg caggggggcc tccggcacct GGGGGACTAC 

2290 2300 2310 2320 2330 F TFLH 2340 

CGCCGGGCGG CCAAGACGGT GAACTTCGGC GTCCTCTACG GCATGTCCGC C dATAGGCTC 
GCGGCCCGCC GGTTCTGCCA CTTGAAGCCG CAGGAGATGC CGTACAGGCG GGTATCCGAG 

2350 2360 2370 2380 2390 2400 

TCCCAGGAGC"ril GCCATCCC CTACGAGGAG GCGGTGGCCT TTATAGAGCG CTACTTCCAA 
AGGGTCCTCG AACGGTAGGG GATGCTCCTC CGCCACCGGA AATATCTCGC GATGAAGGTT 

2410 2420 2430 2440 2450 2460 

AGCTTCCCCA AGGTGCGGGC CTGGATA GAA AAGACCCTGG AGGAGGGGAG GAAGCGGGGC 
TCGAAGGGGT TCCACGCCCG GACCTATCTT TTCTGGGACC TCCTCCCCTC CTTCGCCCCG 

2470 2480 2490 2500 2510 2520 

TACG TGGAAA CCCTCTTCGG AAGAAG GCGC TACGTGCCCG ACCTCAACGC CCGGGTGAAG 

atg^aTcTtt gggagaagcc ttctt)ccgcg atgcacgggc tggagttgcg ggcccacttc 

RT 2530 2540 2550 2560 2570 2580 

AGCGTCAGGG AGGCCGCGGA GCGCATGGCC TTCAACATGC CCGTCCAGGG CACCGCCGCC 

TCGCAGTCCC tccggcgcct cgcgtaccgg aagttgtacg ggcaggtccc gtggcggcgg 

25 90 FTFLSF1 2600 2610 



_ 2620 2630 2640 

GACCTCAT dT AGCTCGCCAT GGTGAAGCTTttI cCCCCGCC TCCGGGAGAT GGGGGCCCGC 



CTGGAGTACT TCGAGCGGTA^ ccacttcgag aajgggggcgg aggccctcta cccccgggcg 



2650 2660 2670 2680 2690 2700 

ATGCTCCTCC AGGTCCACGA CGAGCTCCTC CTGGAGGCCC CCCAAGCGCG GGCCGAGGAG 

TACGAGGAGG tccaggtgct gctcgaggag GACCTCCGGG gggttcgcgc ccggctcctc 



(Continued) 

Figure 4(iv)

(Continued)

2710 2720 2730 2740 2750 2760 

GTGGCGGCTT TGGCCAAGGA GGCCATGGAG AAGGCCTATC CCCTCGCCGT GCCCCTGGAG 

CACCGCCGAA ACCGGTTCCT CCGGTACCTC TTCCGGATAG GGGAGCGGCA CGGGGACCTC 

2770 2780 2790 2800 2810 FTFLU 2820 

GTGGAGGTGG GGATGGGGGA GGACTGGCTT TCCGCCAAGG GltTAGfa gggg dcctgccgtt 



CACCTCCACC CCTACCCCCT QCTGACCGAA AGGCGGTTCC CA/Tg ftcccc gfagqcggcaa 



RTFL4/ RTFL1 2 RTFLV 
2830 2840 2850 2860 2870 2880 



tagaggaagt tcaag gggtt gtcc ctcaqa aacgcctccq ggggaa cgcc ctctgcggct 
atctccttca aq-fEq cccQa cagg gjagtct ttgcggoggt ccccttj gcgg gagacgccga 

~ ^ "~ RTFLJ 

2890 2900 2910 2920 2930 2940 

accaggaggc ctttagcccc aaaggtgcgg gtgaaggctt ccaggccctg ggttctttta 
tggtcctccg gaaatcgggg tttccacgcc cacttccgaa ggtccgggac ccaagaaaat 

2950 2960 2970 2980 2990 3000 

aagggggcgc ttttgacctc gagggccagg aggcgctttc ccttttgaag gacaaagtca 
ttcccccgcg aaaactggag ctcccggtcc tccgcgaaag ggaaaacttc ctgtttcagt 

3010 3020 3030 3040 3050 3060 

cttcctggtc cctttcccgc cagtagtaca cctcaaaccc cccctggt 

gaaggaccag ggaaagggcg gtcatcatgt ggagtttggg ggggacca 



Figure 4(v) 
Figure 10A

that lacks 274 amino acids from the N-terminus of the approximately 94,000 dalton T. flavus DNA polymerase I, and to the protein encoded
thereby which has been designated the T. flavus DNA polymerase I exo fragment. The enzyme fragments are useful in DNA sequencing, 
Thermal Cycle Labeling, Polymerase Chain Reaction, and other molecular biological applications. 